Abstract

In this article we are interested in the density of small linear structures (e.g. arithmetic progressions) in subsets $A$ of the group $\mathbb{F}_p^n$. It is possible to express these densities as certain analytic averages involving $1_A$, the indicator function of $A$. In the higher-order Fourier analytic approach, the function $1_A$ is decomposed as a sum $f_1 + f_2$, where $f_1$ is structured in the sense that it has a simple higher-order Fourier expansion, and $f_2$ is pseudorandom in the sense that the $k$th Gowers uniformity norm of $f_2$, denoted by $\|f_2\|_{U^k}$, is small for a proper value of $k$.

For a given linear structure, we find the smallest degree of uniformity $k$ such that, assuming that $\|f_2\|_{U^k}$ is sufficiently small, it is possible to discard $f_2$ and replace $1_A$ with $f_1$, affecting the corresponding analytic average only negligibly. Previously, Gowers and Wolf solved this problem for the case where $f_1$ is a constant function. Furthermore, our main result solves Problem 7.6 in [W. T. Gowers and J. Wolf. Linear forms and higher-degree uniformity for functions on $\mathbb{F}_p^n$. Geom. Funct. Anal., 21(1):36-69, 2011] regarding the analytic averages that involve more than one subset of $\mathbb{F}_p^n$.
1 Introduction

In additive combinatorics one is often interested in the density of small linear structures (e.g. arithmetic progressions) in subsets of Abelian groups. It is possible to express these densities as certain analytic averages. For example, consider a subset $A$ of a finite Abelian group $G$. Then the density of the $k$-term arithmetic progressions in $A$ is given by

$$\mathbb{E}\left[1_A(X)\,1_A(X+Y)\cdots 1_A(X+(k-1)Y)\right], \tag{1}$$

where $X$ and $Y$ are independent random variables taking values in $G$ uniformly at random, and $1_A$ is the indicator function of $A$. More generally, one is often interested in analyzing



$$\mathbb{E}\left[1_A\Big(\sum_{i=1}^{k}\lambda_{1,i}X_i\Big)\cdots 1_A\Big(\sum_{i=1}^{k}\lambda_{m,i}X_i\Big)\right], \tag{2}$$

where $X_1,\ldots,X_k$ are independent random variables taking values in $G$ uniformly at random, $\lambda_{i,j}$ are constants, and $A \subseteq G$. Analyzing averages of this type and understanding the relations between them is the core of many problems and results in additive combinatorics and analytic number theory, and there are theories which are developed for this purpose. The theory of uniformity, initiated by the proof of Szemerédi's theorem [13], plays an important role in this area, and it was a major breakthrough when Gowers [2] introduced a new notion of uniformity in a Fourier-analytic proof of Szemerédi's theorem.

Gowers' work initiated an extension of classical Fourier analysis, called higher-order Fourier analysis of Abelian groups. In this paper we are only interested in the case where the group is $\mathbb{F}_p^n$, where $p$ is a fixed prime and $n$ is large. In the classical Fourier analysis of $\mathbb{F}_p^n$, a function is expressed as a linear combination of the characters of $\mathbb{F}_p^n$, which are exponentials of linear polynomials; that is, for $\alpha \in \mathbb{F}_p^n$ the corresponding character is defined as $\chi_\alpha(x) = e_p\big(\sum_{i=1}^{n}\alpha_i x_i\big)$, where $e_p(m) := e^{2\pi i m/p}$ for $m \in \mathbb{F}_p$. In higher-order Fourier analysis, the linear polynomials are replaced by higher degree polynomials, and one would like to express a function $f : \mathbb{F}_p^n \to \mathbb{C}$ as a linear combination of the functions $e_p(P)$, where $P$ is a polynomial of a certain degree.

Higher-order Fourier expansions are extremely useful in studying averages that are defined through linear structures. To analyze the average in (2), one usually decomposes the function $1_A$ as $f_1 + f_2$, where $f_1$ has a simple higher-order Fourier expansion, while $f_2$ is "quasirandom", meaning that it shares certain properties with a random function and can be discarded as random noise. More precisely, $\|f_1\|_\infty \le 1$ and there is a small constant $C$ such that $f_1 = \sum_{i=1}^{C} c_i\, e_p(P_i)$, where the $c_i$ are constants and the $P_i$ are low degree polynomials, and $f_2$ is quasirandom in the sense that for some proper constant $k$, its $k$-th Gowers uniformity norm $\|f_2\|_{U^k}$ is small.

The $U^k$ norms increase as $k$ increases, and thus the condition that $\|f_2\|_{U^k}$ is small becomes stronger. Therefore a question arises naturally: given the average (2), what is the smallest $k$ such that, under the assumption that $\|f_2\|_{U^k}$ is sufficiently small in a decomposition $1_A = f_1 + f_2$, one can discard $f_2$, affecting the average only negligibly? This question was answered by Gowers and Wolf [5] in the case that $f_1$ is a constant function, provided that the field size $p$ is not too small. In this work we extend their result to the case where $f_1$ is an arbitrary bounded function.

More concretely, a linear form $L = (\lambda_1,\ldots,\lambda_k) \in \mathbb{F}_p^k$ maps every $\mathbf{x} = (x_1,\ldots,x_k) \in (\mathbb{F}_p^n)^k$ to $L(\mathbf{x}) = \sum_{i=1}^{k}\lambda_i x_i \in \mathbb{F}_p^n$. Let $\mathbb{D}$ denote the complex unit disk $\{z \in \mathbb{C} : |z| \le 1\}$. Gowers and Wolf [4] defined the true complexity of a system of linear forms $\mathcal{L} = \{L_1,\ldots,L_m\}$ as the minimal $d \ge 1$ such that the following holds: for every $\varepsilon > 0$, there exists $\delta > 0$ such that if $f : \mathbb{F}_p^n \to \mathbb{D}$ is a function with $\|f - \mathbb{E}[f]\|_{U^{d+1}} \le \delta$, then

$$\left|\mathbb{E}_{\mathbf{X}\in(\mathbb{F}_p^n)^k}\Big[\prod_{i=1}^{m} f(L_i(\mathbf{X}))\Big] - \mathbb{E}[f]^m\right| \le \varepsilon.$$

That is, as long as $\|f - \mathbb{E}[f]\|_{U^{d+1}}$ is small enough, we can approximate $f$ by the constant function $\mathbb{E}[f]$, affecting the average only negligibly.

Gowers and Wolf [5] fully characterized the true complexity of systems of linear forms.$^1$ Let $L = (\lambda_1,\ldots,\lambda_k) \in \mathbb{F}_p^k$ be a linear form in $k$ variables. The $d$-th tensor power of $L$ is given by

$$L^d = \Big(\prod_{j=1}^{d}\lambda_{i_j} : i_1,\ldots,i_d \in [k]\Big) \in \mathbb{F}_p^{k^d}.$$

Gowers and Wolf [5] proved the following result, which characterizes the true complexity of a system of linear forms by a simple linear algebraic condition.



Informal Statement (Theorem 3.5). Provided that $p$ is sufficiently large, the true complexity of a system $\mathcal{L} = \{L_1,\ldots,L_m\}$ of linear forms is the minimal $d \ge 1$ such that $L_1^{d+1},\ldots,L_m^{d+1}$ are linearly independent.

In this work, we show that the true complexity in fact allows one to approximate $f$ by any other bounded function $g$, as long as $\|f - g\|_{U^{d+1}}$ is small. This removes the requirement that $g$ is a constant function, which was present in the work of Gowers and Wolf. We already mentioned that any bounded function $f$ can be decomposed as $f = f_1 + f_2$, where $f_1$ is "structured" and $f_2$ is "quasirandom" (see Theorem 4.6): our result thus shows that we can approximate $f$ by $f_1$ without affecting averages significantly. In the context of sets, Gowers and Wolf's result allows one to handle only uniform sets $A$, for which $1_A - \mathbb{E}[1_A]$ is pseudorandom, while our result allows one to handle all sets.



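Informal Statement (Theorem 3.8). Let $\mathcal{L} = \{L_1,\ldots,L_m\}$ be a system of linear forms of true complexity $d$. If $p$ is sufficiently large, then for every $\varepsilon > 0$, there exists $\delta > 0$ such that the following holds. Let $f, g : \mathbb{F}_p^n \to \mathbb{D}$ be functions such that $\|f - g\|_{U^{d+1}} \le \delta$. Then

$$\left|\mathbb{E}_{\mathbf{X}\in(\mathbb{F}_p^n)^k}\Big[\prod_{i=1}^{m} f(L_i(\mathbf{X}))\Big] - \mathbb{E}_{\mathbf{X}\in(\mathbb{F}_p^n)^k}\Big[\prod_{i=1}^{m} g(L_i(\mathbf{X}))\Big]\right| \le \varepsilon.$$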



More generally, one may consider averages over several functions. The average in (2) is the probability that for every $j \in \{1,\ldots,m\}$, the linear combination $\sum_{i=1}^{k}\lambda_{j,i}X_i$ belongs to $A$. A more general case is the "off-diagonal" case, where instead of one subset $A$ there are $m$ subsets $A_1,\ldots,A_m \subseteq \mathbb{F}_p^n$, and one is interested in estimating the probability that for every $j \in \{1,\ldots,m\}$, we have $\sum_{i=1}^{k}\lambda_{j,i}X_i \in A_j$. Similar to the diagonal case, this can be expressed as an analytic average, and then one can decompose each function into structured and pseudorandom parts $1_{A_i} = g_i + h_i$, and once again the question arises of what level of uniformity suffices for discarding the pseudorandom parts without affecting the average significantly. Similar to the diagonal case, Gowers and Wolf [5] resolved this problem when all $g_i$ are constant functions. In this work we also extend the off-diagonal case to the general setting where the $g_i$ can be arbitrary bounded functions.



1 Their result in fact requires the field size p not to be too small. Our results share the same requirement. 
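Informal Statement (Theorem 3.8). Let $\mathcal{L} = \{L_1,\ldots,L_m\}$ be a system of linear forms of true complexity $d$. If $p$ is sufficiently large, then for every $\varepsilon > 0$, there exists $\delta > 0$ such that the following holds. Let $f_i, g_i : \mathbb{F}_p^n \to \mathbb{D}$, $i \in [m]$, be functions such that $\|f_i - g_i\|_{U^{d+1}} \le \delta$. Then

$$\left|\mathbb{E}_{\mathbf{X}\in(\mathbb{F}_p^n)^k}\Big[\prod_{i=1}^{m} f_i(L_i(\mathbf{X}))\Big] - \mathbb{E}_{\mathbf{X}\in(\mathbb{F}_p^n)^k}\Big[\prod_{i=1}^{m} g_i(L_i(\mathbf{X}))\Big]\right| \le \varepsilon.$$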



It turns out that a more general phenomenon holds, and all of the results above are immediate corollaries of the following theorem, which is the main technical contribution of this work. We show that in order to bound these averages, it suffices to have a single index $i$ such that $L_i^{d+1}$ is linearly independent of $\{L_j^{d+1} : j \ne i\}$ and that $\|f_i\|_{U^{d+1}}$ is small. This in particular resolves a conjecture of Gowers and Wolf [5].

Informal Statement (Theorem 3.9). Let $\mathcal{L} = \{L_1,\ldots,L_m\}$ be a system of linear forms. Assume that $L_1^{d+1}$ is not in the linear span of $L_2^{d+1},\ldots,L_m^{d+1}$. If $p$ is sufficiently large, then for every $\varepsilon > 0$, there exists $\delta > 0$ such that for any functions $f_1,\ldots,f_m : \mathbb{F}_p^n \to \mathbb{D}$ with $\|f_1\|_{U^{d+1}} \le \delta$, we have

$$\left|\mathbb{E}_{\mathbf{X}\in(\mathbb{F}_p^n)^k}\Big[\prod_{i=1}^{m} f_i(L_i(\mathbf{X}))\Big]\right| \le \varepsilon.$$



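So far, we have only discussed which conditions allow discarding the pseudorandom terms. Note that after removing those terms, one arrives at an average of the form

$$\mathbb{E}\left[f_1\Big(\sum_{i=1}^{k}\lambda_{1,i}X_i\Big)\cdots f_m\Big(\sum_{i=1}^{k}\lambda_{m,i}X_i\Big)\right], \tag{3}$$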



where each $f_j$ satisfies $\|f_j\|_\infty \le 1$ and has a simple higher-order Fourier expansion. For these expansions to be useful, one needs some kind of orthogonality or at least an approximation of it. The works of Green and Tao [7] and Kaufman and Lovett [11] provide an approximate orthogonality that can be used to analyze averages such as $\mathbb{E}_{X\in\mathbb{F}_p^n}[f_1(X)\cdots f_m(X)]$ in a straightforward manner when proper higher-order Fourier expansions of $f_1,\ldots,f_m$ are known. However, it is not a priori clear that these results can be applied to analyze more general averages of the form (3). In Lemmas 5.1, 5.2 and 5.3 we prove extensions of the results of Green and Tao [7] that are applicable to such general averages. These extensions allow us to approximate (3) with a simple formula in terms of the higher-order Fourier coefficients of $f_1,\ldots,f_m$.

Lemmas 5.1, 5.2 and 5.3 are quite useful, and in fact the proof of our main result, Theorem 3.9, heavily relies on them. We also apply these lemmas to prove an invariance result (Proposition 5.5), which is one of the key tools in our subsequent paper [10], which studies correlation testing for affine invariant properties on $\mathbb{F}_p^n$.

In the setting of functions on $\mathbb{Z}_N$, recently Green and Tao [8] established similar results and characterized the true complexity of systems of linear forms.



Paper organization. We give some basic definitions and notation in Section 2. We discuss the complexity of systems of linear forms, and formally state our main theorems, in Section 3. We give an overview of higher-order Fourier analysis in Section 4. We prove a strong orthogonality result in Section 5. We then use these to prove our main result, Theorem 3.9, in Section 6. We conclude with some open problems in Section 7.
2 Definitions and notations

For a natural number $k$, denote $[k] := \{1,\ldots,k\}$. The complex unit disk is denoted by $\mathbb{D} = \{z \in \mathbb{C} : |z| \le 1\}$. We will usually use the lowercase letters $x, y, z$ to denote elements of $\mathbb{F}_p^n$. For $x \in \mathbb{F}_p^n$ and $i \in [n]$, $x(i)$ denotes the $i$-th coordinate of $x$, i.e. $x = (x(1),\ldots,x(n))$. We frequently need to work with the elements of $(\mathbb{F}_p^n)^k$, which we regard as vectors with $k$ coordinates. These elements are denoted with bold font, e.g. $\mathbf{x} = (x_1,\ldots,x_k) \in (\mathbb{F}_p^n)^k$. Capital letters $X$, $Y$, etc. are used to denote random variables. For an element $m \in \mathbb{F}_p$, we use the notation $e_p(m) := e^{2\pi i m/p}$. We denote by $f, g, h$ functions from $\mathbb{F}_p^n$ to $\mathbb{C}$. We denote $n$-variate polynomials over $\mathbb{F}_p$ by $P, Q$.
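The bias of a function $f : \mathbb{F}_p^n \to \mathbb{C}$ is defined to be the quantity

$$\mathrm{bias}(f) := \left|\mathbb{E}_{X\in\mathbb{F}_p^n}[f(X)]\right|. \tag{4}$$

The inner product of two functions $f, g : \mathbb{F}_p^n \to \mathbb{C}$ is defined as

$$\langle f, g\rangle := \mathbb{E}_{X\in\mathbb{F}_p^n}\big[f(X)\overline{g(X)}\big]. \tag{5}$$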

A linear form in $k$ variables is a vector $L = (\lambda_1,\ldots,\lambda_k) \in \mathbb{F}_p^k$ regarded as a linear function from $V^k$ to $V$, for every vector space $V$ over $\mathbb{F}_p$: if $\mathbf{x} = (x_1,\ldots,x_k) \in V^k$, then $L(\mathbf{x}) := \lambda_1 x_1 + \ldots + \lambda_k x_k$. A system of $m$ linear forms in $k$ variables is a finite set $\mathcal{L} = \{L_1,\ldots,L_m\}$ of distinct linear forms, each in $k$ variables. For a function $f : \mathbb{F}_p^n \to \mathbb{C}$ and a system of linear forms $\mathcal{L} = \{L_1,\ldots,L_m\}$ in $k$ variables, define

$$t_{\mathcal{L}}(f) := \mathbb{E}\Big[\prod_{i=1}^{m} f(L_i(\mathbf{X}))\Big], \tag{6}$$

where $\mathbf{X}$ is a random variable taking values uniformly in $(\mathbb{F}_p^n)^k$.
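For very small parameters, the average in (6) can be evaluated by exhaustive enumeration. The following is a minimal sketch rather than anything taken from the paper: the function name `t_average` and the encoding of elements of $\mathbb{F}_p^n$ as tuples of integers are assumptions made for illustration only.

```python
import itertools

def t_average(f, forms, p, n):
    """Brute-force t_L(f) = E_X prod_i f(L_i(X)), X uniform in (F_p^n)^k.

    f     : callable taking a tuple of n residues mod p (hypothetical encoding)
    forms : list of m coefficient tuples (lambda_1, ..., lambda_k)
    """
    k = len(forms[0])
    points = list(itertools.product(range(p), repeat=n))
    total = 0
    for xs in itertools.product(points, repeat=k):
        term = 1
        for lam in forms:
            # L(x) = lambda_1 x_1 + ... + lambda_k x_k, coordinate-wise mod p
            y = tuple(sum(l * x[c] for l, x in zip(lam, xs)) % p
                      for c in range(n))
            term *= f(y)
        total += term
    return total / len(points) ** k

# Density of 3-term progressions X, X+Y, X+2Y inside A = {0, 1} in F_3.
three_ap = [(1, 0), (1, 1), (1, 2)]
indicator = lambda x: 1 if x[0] in (0, 1) else 0
density = t_average(indicator, three_ap, p=3, n=1)
```

In $\mathbb{F}_3$ every progression with $Y \neq 0$ covers the whole group, so only the two trivial progressions lie in $A$ and `density` equals $2/9$. The enumeration costs $p^{nk}$ evaluations, so it is only a sanity check for tiny cases.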



Definition 2.1 (Homogeneous linear forms). A system of linear forms $\mathcal{L} = \{L_1,\ldots,L_m\}$ in $k$ variables is called homogeneous if for a uniform random variable $\mathbf{X} \in (\mathbb{F}_p^n)^k$ and every fixed $c \in \mathbb{F}_p^n$, $(L_1(\mathbf{X}),\ldots,L_m(\mathbf{X}))$ has the same distribution as $(L_1(\mathbf{X}) + c,\ldots,L_m(\mathbf{X}) + c)$.

We wish to identify two systems of linear forms $\mathcal{L}_0 = \{L_1,\ldots,L_m\}$ in $k_0$ variables and $\mathcal{L}_1 = \{L_1',\ldots,L_m'\}$ in $k_1$ variables if, after possibly renumbering the linear forms, $(L_1(\mathbf{X}),\ldots,L_m(\mathbf{X}))$ has the same distribution as $(L_1'(\mathbf{Y}),\ldots,L_m'(\mathbf{Y}))$, where $\mathbf{X}$ and $\mathbf{Y}$ are uniform random variables taking values in $(\mathbb{F}_p^n)^{k_0}$ and $(\mathbb{F}_p^n)^{k_1}$, respectively. Note that the distribution of $(L_1(\mathbf{X}),\ldots,L_m(\mathbf{X}))$ depends exactly on the linear dependencies between $L_1,\ldots,L_m$, and two systems of linear forms lead to the same distributions if and only if they have the same linear dependencies.

Definition 2.2 (Isomorphic linear forms). Two systems of linear forms $\mathcal{L}_0$ and $\mathcal{L}_1$ are isomorphic if and only if there exists a bijection from $\mathcal{L}_0$ to $\mathcal{L}_1$ that can be extended to an invertible linear transformation $T : \mathrm{span}(\mathcal{L}_0) \to \mathrm{span}(\mathcal{L}_1)$.

Note that if $\mathcal{L} = \{L_1,\ldots,L_m\}$ is a homogeneous system of linear forms, then $(L_1(\mathbf{X}),\ldots,L_m(\mathbf{X}))$ has the same distribution as $(L_1(\mathbf{X}) + Y,\ldots,L_m(\mathbf{X}) + Y)$, where $Y$ is a uniform random variable taking values in $\mathbb{F}_p^n$ and is independent of $\mathbf{X}$. We conclude with the following trivial observation.

Observation 2.3. Every homogeneous system of linear forms is isomorphic to a system of linear forms in which there is a variable that appears with coefficient exactly one in every linear form.



Next we define the Gowers uniformity norms. They are defined in the more general setting of 
arbitrary finite Abelian groups. 



Definition 2.4 (Gowers uniformity norms). Let $G$ be a finite Abelian group and $f : G \to \mathbb{C}$. For an integer $k \ge 1$, the $k$-th Gowers norm of $f$, denoted $\|f\|_{U^k}$, is defined by

$$\|f\|_{U^k} := \left(\mathbb{E}\Big[\prod_{S\subseteq[k]} \mathcal{C}^{k-|S|} f\Big(X + \sum_{i\in S} Y_i\Big)\Big]\right)^{1/2^k}, \tag{7}$$

where $\mathcal{C}$ denotes the complex conjugation operator, and $X, Y_1,\ldots,Y_k$ are independent random variables taking values in $G$ uniformly at random.



In this article we are only interested in the case where $G = \mathbb{F}_p^n$. These norms were first defined in [2] in the case where $G$ is the group $\mathbb{Z}_N$. Note that $\|f\|_{U^1} = |\mathbb{E}[f(X)]|$, and thus $\|\cdot\|_{U^1}$ is a semi-norm rather than a norm. The facts that the expectation on the right-hand side of (7) is always nonnegative, and that for $k > 1$, $\|\cdot\|_{U^k}$ is actually a norm, are easy to prove, but certainly not trivial (see [2] for a proof).
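Definition 2.4 can be checked numerically on tiny groups. The sketch below is illustrative only: the name `gowers_norm`, the tuple encoding of $\mathbb{F}_p^n$, and the use of an absolute value to absorb floating-point error before taking the root are assumptions of this example, not part of the paper.

```python
import cmath
import itertools

def gowers_norm(f, p, n, k):
    """Brute-force U^k norm of f : F_p^n -> C, following Definition 2.4."""
    points = list(itertools.product(range(p), repeat=n))
    total = 0
    for x in points:
        for ys in itertools.product(points, repeat=k):
            term = 1
            for S in itertools.product((0, 1), repeat=k):
                # z = x + sum of the y_i with i in S (S is an indicator vector)
                z = x
                for i in range(k):
                    if S[i]:
                        z = tuple((a + b) % p for a, b in zip(z, ys[i]))
                v = complex(f(z))
                # apply conjugation k - |S| times
                if (k - sum(S)) % 2 == 1:
                    v = v.conjugate()
                term *= v
            total += term
    avg = total / len(points) ** (k + 1)
    # avg is real and nonnegative in exact arithmetic; abs() removes rounding noise
    return abs(avg) ** (1 / 2 ** k)

p = 3
e_p = lambda m: cmath.exp(2j * cmath.pi * (m % p) / p)
quadratic_phase = lambda x: e_p(x[0] * x[0])      # e_p(P) with deg P = 2
mean = sum(quadratic_phase((x,)) for x in range(p)) / p
u1 = gowers_norm(quadratic_phase, p, 1, 1)
u3 = gowers_norm(quadratic_phase, p, 1, 3)
```

Here `u1` agrees with $|\mathbb{E}[f]|$, and `u3` is $1$ up to rounding, since the third multiplicative derivative of a quadratic phase is identically $1$.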



3 Complexity of a system of linear forms 

Let $\mathcal{L} = \{L_1,\ldots,L_m\}$ be a system of linear forms in $k$ variables. Note that if $A \subseteq \mathbb{F}_p^n$ and $1_A : \mathbb{F}_p^n \to \{0,1\}$ is the indicator function of $A$, then $t_{\mathcal{L}}(1_A)$ is the probability that $L_1(\mathbf{X}),\ldots,L_m(\mathbf{X})$ all fall in $A$, where $\mathbf{X} \in (\mathbb{F}_p^n)^k$ is uniformly chosen. Roughly speaking, we say $A \subseteq \mathbb{F}_p^n$ is pseudorandom with regard to $\mathcal{L}$ if

$$t_{\mathcal{L}}(1_A) \approx \Big(\frac{|A|}{p^n}\Big)^m;$$

that is, if the probability that all $L_1(\mathbf{X}),\ldots,L_m(\mathbf{X})$ fall in $A$ is close to what we would expect if $A$ were a random subset of $\mathbb{F}_p^n$ of size $|A|$. Let $\alpha = |A|/p^n$ be the density of $A$, and define $f := 1_A - \alpha$. We have

$$t_{\mathcal{L}}(1_A) = t_{\mathcal{L}}(\alpha + f) = \alpha^m + \sum_{S\subseteq[m],\, S\neq\emptyset} \alpha^{m-|S|}\, t_{\{L_i : i\in S\}}(f).$$

So, a sufficient condition for $A$ to be pseudorandom with regard to $\mathcal{L}$ is that $t_{\{L_i : i\in S\}}(f) \approx 0$ for all nonempty subsets $S \subseteq [m]$. Green and Tao [9] showed that a sufficient condition for this to occur is that $\|f\|_{U^{s+1}}$ is small enough, where $s$ is the Cauchy-Schwarz complexity of the system of linear forms.

Definition 3.1 (Cauchy-Schwarz complexity). Let $\mathcal{L} = \{L_1,\ldots,L_m\}$ be a system of linear forms. The Cauchy-Schwarz complexity of $\mathcal{L}$ is the minimal $s$ such that the following holds. For every $1 \le i \le m$, we can partition $\{L_j\}_{j\in[m]\setminus\{i\}}$ into $s+1$ subsets, such that $L_i$ does not belong to the linear span of each such subset.
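Definition 3.1 is purely combinatorial, so for a handful of forms it can be evaluated by trying every partition. The following is a rough sketch under stated assumptions: the helper names `rank_mod_p`, `in_span` and `cs_complexity` are hypothetical, parts of a partition are allowed to be empty (which does not change the minimum), and the search is exponential in $m$.

```python
import itertools

def rank_mod_p(rows, p):
    """Rank of a matrix over F_p by Gaussian elimination (works on a copy)."""
    rows = [[v % p for v in r] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)          # modular inverse, Python 3.8+
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                c = rows[r][col]
                rows[r] = [(a - c * b) % p for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def in_span(v, vectors, p):
    """True if v lies in the F_p-span of `vectors` (empty span is {0})."""
    if not vectors:
        return all(x % p == 0 for x in v)
    return rank_mod_p(vectors + [v], p) == rank_mod_p(vectors, p)

def cs_complexity(forms, p):
    """Minimal s as in Definition 3.1, or None if no partition works."""
    m = len(forms)
    worst = 0
    for i in range(m):
        target = list(forms[i])
        others = [list(forms[j]) for j in range(m) if j != i]
        for parts in range(1, m):
            found = any(
                all(not in_span(target,
                                [o for o, l in zip(others, labels) if l == c], p)
                    for c in range(parts))
                for labels in itertools.product(range(parts), repeat=m - 1))
            if found:
                break
        else:
            return None   # e.g. L_i is a multiple of another form
        worst = max(worst, parts - 1)
    return worst

three_ap = [(1, 0), (1, 1), (1, 2)]
four_ap = [(1, 0), (1, 1), (1, 2), (1, 3)]
```

For the arithmetic progressions above, any two forms already span $\mathbb{F}_p^2$, so each part must be a singleton and the complexity of a $k$-term progression comes out as $k-2$.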

The reason for the term Cauchy-Schwarz complexity is the following lemma due to Green and 
Tao [9], whose proof is based on a clever iterative application of the Cauchy-Schwarz inequality. 






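Lemma 3.2 ([9]). Let $f_1,\ldots,f_m : \mathbb{F}_p^n \to \mathbb{D}$. Let $\mathcal{L} = \{L_1,\ldots,L_m\}$ be a system of $m$ linear forms in $k$ variables of Cauchy-Schwarz complexity $s$. Then

$$\left|\mathbb{E}_{\mathbf{X}\in(\mathbb{F}_p^n)^k}\Big[\prod_{i=1}^{m} f_i(L_i(\mathbf{X}))\Big]\right| \le \min_{1\le i\le m} \|f_i\|_{U^{s+1}}.$$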



Note that the Cauchy-Schwarz complexity of any system of $m$ linear forms in which any two linear forms are linearly independent (i.e. one is not a multiple of the other) is at most $m - 2$, since we can always partition $\{L_j\}_{j\in[m]\setminus\{i\}}$ into the $m - 1$ singleton subsets.

The following is an immediate corollary of Lemma 3.2.



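Corollary 3.3. Let $\mathcal{L} = \{L_1,\ldots,L_m\}$ be a system of linear forms in $k$ variables of Cauchy-Schwarz complexity $s$. Let $f_i, g_i : \mathbb{F}_p^n \to \mathbb{D}$ be functions for $1 \le i \le m$. Assume that $\|f_i - g_i\|_{U^{s+1}} \le \frac{\varepsilon}{2m}$ for all $1 \le i \le m$. Then

$$\left|\mathbb{E}_{\mathbf{X}}\Big[\prod_{i=1}^{m} f_i(L_i(\mathbf{X}))\Big] - \mathbb{E}_{\mathbf{X}}\Big[\prod_{i=1}^{m} g_i(L_i(\mathbf{X}))\Big]\right| \le \varepsilon,$$

where $\mathbf{X} \in (\mathbb{F}_p^n)^k$ is uniform.

In particular, if $A \subseteq \mathbb{F}_p^n$ of size $|A| = \alpha p^n$ satisfies $\|1_A - \alpha\|_{U^{s+1}} \approx 0$, then $t_{\mathcal{L}}(1_A) \approx \alpha^m$.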



3.1 The true complexity 

The Cauchy-Schwarz complexity of $\mathcal{L}$ gives an upper bound on $s$ such that if $\|1_A - \alpha\|_{U^{s+1}}$ is small enough, then $A$ is pseudorandom with regard to $\mathcal{L}$. Gowers and Wolf [4] defined the true complexity of a system of linear forms as the minimal $s$ such that the above condition holds for all sets $A$.

Definition 3.4 (True complexity [4]). Let $\mathcal{L} = \{L_1,\ldots,L_m\}$ be a system of linear forms over $\mathbb{F}_p$. The true complexity of $\mathcal{L}$ is the smallest $d \in \mathbb{N}$ with the following property. For every $\varepsilon > 0$, there exists $\delta > 0$ such that if $f : \mathbb{F}_p^n \to \mathbb{D}$ satisfies $\|f\|_{U^{d+1}} \le \delta$, then

$$|t_{\mathcal{L}}(f)| \le \varepsilon.$$

An obvious bound on the true complexity is the Cauchy-Schwarz complexity of the system. However, there are cases where this is not tight. Gowers and Wolf [5] characterized the true complexity of systems of linear forms, assuming the field is not too small. For a linear form $L \in \mathbb{F}_p^k$, let $L^d \in \mathbb{F}_p^{k^d}$ be the $d$th tensor power of $L$. That is, if $L = (\lambda_1,\ldots,\lambda_k)$, then

$$L^d = \Big(\prod_{j=1}^{d}\lambda_{i_j} : i_1,\ldots,i_d \in [k]\Big) \in \mathbb{F}_p^{k^d}.$$

Theorem 3.5 (Characterization of the true complexity of linear systems, Theorem 6.1 in [5]). Let $\mathcal{L} = \{L_1,\ldots,L_m\}$ be a system of linear forms over $\mathbb{F}_p^n$ of Cauchy-Schwarz complexity $s \le p$. The true complexity of $\mathcal{L}$ is the minimal $d$ such that $L_1^{d+1},\ldots,L_m^{d+1}$ are linearly independent over $\mathbb{F}_p$.
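The linear-algebraic condition in Theorem 3.5 is easy to test mechanically. The sketch below only performs the rank computation; it does not check the hypothesis on the Cauchy-Schwarz complexity, and it starts the search at $d = 1$, following the convention "minimal $d \ge 1$" used in the introduction. The names `tensor_power`, `true_complexity` and the cutoff `max_d` are assumptions of this illustration.

```python
import itertools
import math

def rank_mod_p(rows, p):
    """Rank of a matrix over F_p by Gaussian elimination (works on a copy)."""
    rows = [[v % p for v in r] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)          # modular inverse, Python 3.8+
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                c = rows[r][col]
                rows[r] = [(a - c * b) % p for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def tensor_power(form, d, p):
    """L^d: all products lambda_{i_1} ... lambda_{i_d}, reduced mod p."""
    k = len(form)
    return [math.prod(form[i] for i in idx) % p
            for idx in itertools.product(range(k), repeat=d)]

def true_complexity(forms, p, max_d=6):
    """Smallest d >= 1 with L_1^{d+1}, ..., L_m^{d+1} independent over F_p.

    Returns None if no such d <= max_d exists (max_d is an arbitrary cutoff).
    """
    for d in range(1, max_d + 1):
        powers = [tensor_power(L, d + 1, p) for L in forms]
        if rank_mod_p(powers, p) == len(forms):
            return d
    return None

three_ap = [(1, 0), (1, 1), (1, 2)]
four_ap = [(1, 0), (1, 1), (1, 2), (1, 3)]
```

For progressions, the tensor powers of $(1, a)$ reduce to the Vandermonde rows $(1, a, a^2, \ldots)$, so the three-term and four-term progressions give $1$ and $2$ respectively over $\mathbb{F}_5$.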






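A natural generalization is to allow for multiple sets. Let $A_1,\ldots,A_m \subseteq \mathbb{F}_p^n$ be sets of densities $\alpha_1,\ldots,\alpha_m$. Let $\mathcal{L} = \{L_1,\ldots,L_m\}$ be a system of linear forms over $\mathbb{F}_p$. We say $A_1,\ldots,A_m$ are pseudorandom with respect to $L_1,\ldots,L_m$ if

$$\Pr_{\mathbf{X}\in(\mathbb{F}_p^n)^k}\left[L_1(\mathbf{X}) \in A_1,\ldots,L_m(\mathbf{X}) \in A_m\right] \approx \alpha_1\cdots\alpha_m.$$

Analogously to the case of a single set, let $f_i = 1_{A_i} - \alpha_i$. Then a sufficient condition is that for all nonempty subsets $S \subseteq [m]$, we have

$$\mathbb{E}\Big[\prod_{i\in S} f_i(L_i(\mathbf{X}))\Big] \approx 0.$$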



In [5], Gowers and Wolf showed that if $\mathcal{L}$ has true complexity $d$ and $\|f_1\|_{U^{d+1}},\ldots,\|f_m\|_{U^{d+1}}$ are small enough, then this stronger condition also holds.



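Theorem 3.6 (Theorem 7.2 in [5]). Let $\mathcal{L} = \{L_1,\ldots,L_m\}$ be a system of linear forms over $\mathbb{F}_p^n$ of Cauchy-Schwarz complexity $s \le p$ and true complexity $d$. Then for every $\varepsilon > 0$, there exists $\delta > 0$ such that the following holds. Let $f_1,\ldots,f_m : \mathbb{F}_p^n \to \mathbb{D}$ be functions such that $\|f_i\|_{U^{d+1}} \le \delta$ for all $1 \le i \le m$. Then for all nonempty subsets $S \subseteq [m]$ we have

$$\left|\mathbb{E}_{\mathbf{X}\in(\mathbb{F}_p^n)^k}\Big[\prod_{i\in S} f_i(L_i(\mathbf{X}))\Big]\right| \le \varepsilon.$$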



In particular, Gowers and Wolf used this to derive the following corollary. 

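Corollary 3.7 (Theorem 7.1 in [5]). Let $\mathcal{L} = \{L_1,\ldots,L_m\}$ be a system of linear forms of true complexity $d$ and Cauchy-Schwarz complexity at most $p$. Then for every $\varepsilon > 0$, there exists $\delta > 0$ such that the following holds. Let $f_i : \mathbb{F}_p^n \to \mathbb{D}$ for $1 \le i \le m$ be functions such that $\|f_i - \mathbb{E}[f_i]\|_{U^{d+1}} \le \delta$. Then

$$\left|\mathbb{E}_{\mathbf{X}}\Big[\prod_{i=1}^{m} f_i(L_i(\mathbf{X}))\Big] - \prod_{i=1}^{m}\mathbb{E}[f_i]\right| \le \varepsilon.$$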



Note that Corollary 3.7 says that if $\|f_i - \mathbb{E}[f_i]\|_{U^{d+1}}$ is small for all $i \in [m]$, then in the average $\mathbb{E}_{\mathbf{X}}\big[\prod_{i=1}^{m} f_i(L_i(\mathbf{X}))\big]$ one can replace the functions $f_i$ with their expected values $\mathbb{E}[f_i]$, causing only a small error of $\varepsilon$. In other words, if in the decomposition $f_i = \mathbb{E}[f_i] + (f_i - \mathbb{E}[f_i])$ the part $(f_i - \mathbb{E}[f_i])$ is sufficiently pseudorandom, then it is possible to discard it. As we shall see in Section 4.2, for every function $f_i : \mathbb{F}_p^n \to \mathbb{D}$, it is possible to find a "structured" function $g_i$ such that $f_i - g_i$ is pseudorandom in the sense that $\|f_i - g_i\|_{U^{d+1}}$ can be made arbitrarily small. However, in the general case the function $g_i$ will not necessarily be the constant function $\mathbb{E}[f_i]$. Hence it is important to obtain a version of Corollary 3.7 that can be applied in this general situation. We achieve this in the following theorem, which qualitatively improves both Corollaries 3.3 and 3.7.

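Theorem 3.8. Let $\mathcal{L} = \{L_1,\ldots,L_m\}$ be a system of linear forms of true complexity $d$ and Cauchy-Schwarz complexity at most $p$. Then for every $\varepsilon > 0$, there exists $\delta > 0$ such that the following holds. Let $f_i, g_i : \mathbb{F}_p^n \to \mathbb{D}$ for $1 \le i \le m$ be functions such that $\|f_i - g_i\|_{U^{d+1}} \le \delta$. Then

$$\left|\mathbb{E}_{\mathbf{X}}\Big[\prod_{i=1}^{m} f_i(L_i(\mathbf{X}))\Big] - \mathbb{E}_{\mathbf{X}}\Big[\prod_{i=1}^{m} g_i(L_i(\mathbf{X}))\Big]\right| \le \varepsilon,$$

where $\mathbf{X} \in (\mathbb{F}_p^n)^k$ is uniform.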
In fact, we prove a stronger result, from which Theorem 3.8 follows immediately. As Gowers and Wolf [5] correctly conjectured, in Theorem 3.6 the condition that the true complexity of $\mathcal{L}$ is at most $d$ can be replaced by a much weaker condition. It suffices to assume that $L_1^{d+1}$ is linearly independent of $L_2^{d+1},\ldots,L_m^{d+1}$. In fact it turns out that more is true,$^2$ and the condition that all of $\|f_1\|_{U^{d+1}},\ldots,\|f_m\|_{U^{d+1}}$ are small can also be replaced by the weaker condition that only $\|f_1\|_{U^{d+1}}$ is small.

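Theorem 3.9 (Main theorem). Let $\mathcal{L} = \{L_1,\ldots,L_m\}$ be a system of linear forms of Cauchy-Schwarz complexity at most $p$. Let $d \ge 0$, and assume that $L_1^{d+1}$ is not in the linear span of $L_2^{d+1},\ldots,L_m^{d+1}$. Then for every $\varepsilon > 0$, there exists $\delta > 0$ such that for any functions $f_1,\ldots,f_m : \mathbb{F}_p^n \to \mathbb{D}$ with $\|f_1\|_{U^{d+1}} \le \delta$, we have

$$\left|\mathbb{E}_{\mathbf{X}\in(\mathbb{F}_p^n)^k}\Big[\prod_{i=1}^{m} f_i(L_i(\mathbf{X}))\Big]\right| \le \varepsilon.$$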



Theorem 3.9 improves both Lemma 3.2 and Theorem 3.6. Let $s \le p$ denote the Cauchy-Schwarz complexity of $\mathcal{L}$ in Theorem 3.9. Lemma 3.2 requires the stronger condition $\|f_1\|_{U^{s+1}} \le \delta$, and Theorem 3.6 requires two stronger conditions: that $L_1^{d+1},\ldots,L_m^{d+1}$ are linearly independent; and that all of $\|f_1\|_{U^{d+1}},\ldots,\|f_m\|_{U^{d+1}}$ are bounded by $\delta$.

4 Higher-order Fourier analysis 

Although Fourier analysis is a powerful tool in arithmetic combinatorics, there are key questions that cannot be addressed by this method in its classical form. For example, in 1953 Roth [12] used Fourier analysis to show that every dense subset of integers contains 3-term arithmetic progressions. For more than four decades, generalizing Roth's Fourier-analytic proof remained an important unsolved problem, until finally Gowers in [2] introduced an extension of classical Fourier analysis, which enabled him to obtain such a generalization. The work of Gowers initiated a theory which has now come to be known as higher-order Fourier analysis. Since then, several mathematicians have contributed to major developments in this rapidly growing theory.

This section has two purposes. One is to review the main results that form the foundations of higher-order Fourier analysis. The second is to establish some new facts that enable us to deal with the averages $t_{\mathcal{L}}$ conveniently by appealing to higher-order Fourier analysis. The work of Gowers and Wolf [3] plays a central role for us, and many ideas in the proofs and the new facts established in this section are suggested by their work.

The characters of $\mathbb{F}_p^n$ are exponentials of linear polynomials; that is, for $\alpha \in \mathbb{F}_p^n$, the corresponding character is defined as $\chi_\alpha(x) = e_p\big(\sum_{i=1}^{n}\alpha_i x_i\big)$. In higher-order Fourier analysis, the linear polynomials $\sum \alpha_i x_i$ are replaced by higher degree polynomials, and one would like to express a function $f : \mathbb{F}_p^n \to \mathbb{C}$ as a linear combination of the functions $e_p(P)$, where $P$ is a polynomial of a certain degree.

2 Gowers and Wolf required that $L_1^{d+1}$ is linearly independent of $L_2^{d+1},\ldots,L_m^{d+1}$, and that all of $\|f_1\|_{U^{d+1}},\ldots,\|f_m\|_{U^{d+1}}$ be bounded by $\delta$.

Consider a function $f : \mathbb{F}_p^n \to \mathbb{C}$ and a system of linear forms $\mathcal{L} = \{L_1,\ldots,L_m\}$. The basic properties of characters enable us to express $t_{\mathcal{L}}(f)$ as a simple formula in terms of the Fourier coefficients of $f$. Indeed, if $f := \sum_{\alpha\in\mathbb{F}_p^n}\widehat{f}(\alpha)\chi_\alpha$ is the Fourier expansion of $f$, then it is easy to see that

$$t_{\mathcal{L}}(f) = \sum \widehat{f}(\alpha_1)\cdots\widehat{f}(\alpha_m), \tag{8}$$

where the sum is over all $\alpha_1,\ldots,\alpha_m \in \mathbb{F}_p^n$ satisfying $\sum_{i=1}^{m}\alpha_i \otimes L_i \equiv 0$. The tools that we develop in this section enable us to obtain simple formulas similar to (8) when the Fourier expansion is replaced by a proper higher-order Fourier expansion.
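Formula (8) can be sanity-checked numerically in the simplest case $n = 1$, where each $\alpha_i$ is a scalar and the constraint $\sum_i \alpha_i \otimes L_i \equiv 0$ reads $\sum_i \lambda_{i,j}\alpha_i \equiv 0 \pmod p$ for every variable index $j$. The sketch below is an illustration under that assumption; the function names are hypothetical, and $\widehat{f}(\alpha) = \mathbb{E}_x[f(x)\overline{\chi_\alpha(x)}]$ is the normalization assumed here.

```python
import cmath
import itertools

def e_p(m, p):
    """e_p(m) = exp(2*pi*i*m/p)."""
    return cmath.exp(2j * cmath.pi * (m % p) / p)

def fourier_coefficients(f, p):
    """hat f(a) = E_x f(x) * conj(chi_a(x)) for f given as a list of p values."""
    return [sum(f[x] * e_p(-a * x, p) for x in range(p)) / p for a in range(p)]

def t_direct(f, forms, p):
    """t_L(f) computed straight from definition (6), with n = 1."""
    k = len(forms[0])
    total = 0
    for xs in itertools.product(range(p), repeat=k):
        term = 1
        for lam in forms:
            term *= f[sum(l * x for l, x in zip(lam, xs)) % p]
        total += term
    return total / p ** k

def t_fourier(f, forms, p):
    """Right-hand side of (8): sum over (alpha_1..alpha_m) meeting the constraint."""
    fh = fourier_coefficients(f, p)
    m, k = len(forms), len(forms[0])
    total = 0
    for alphas in itertools.product(range(p), repeat=m):
        # sum_i lambda_{i,j} * alpha_i must vanish mod p for every variable j
        if all(sum(forms[i][j] * alphas[i] for i in range(m)) % p == 0
               for j in range(k)):
            term = 1
            for a in alphas:
                term *= fh[a]
            total += term
    return total

three_ap = [(1, 0), (1, 1), (1, 2)]
f = [1, 1, 0]                      # indicator of {0, 1} in F_3
lhs = t_direct(f, three_ap, 3)
rhs = t_fourier(f, three_ap, 3)
```

For this example the constraint forces $\alpha_1 = \alpha_2 = \alpha_3$, so the right-hand side is $\sum_\alpha \widehat{f}(\alpha)^3$, and both sides equal $2/9$ up to rounding.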

4.1 Inverse theorems for Gowers uniformity norms 

We start with some basic definitions. 

Polynomials: Consider a function $f : \mathbb{F}_p^n \to \mathbb{F}_p$. For an element $y \in \mathbb{F}_p^n$, define the derivative of $f$ in the direction $y$ as $\Delta_y f(x) = f(x + y) - f(x)$. Inductively we define $\Delta_{y_1,\ldots,y_k} f = \Delta_{y_k}(\Delta_{y_1,\ldots,y_{k-1}} f)$, for directions $y_1,\ldots,y_k \in \mathbb{F}_p^n$. We say that $f$ is a polynomial of degree at most $d$ if for every $y_1,\ldots,y_{d+1} \in \mathbb{F}_p^n$, we have $\Delta_{y_1,\ldots,y_{d+1}} f \equiv 0$. The set of polynomials of degree at most $d$ is a vector space over $\mathbb{F}_p$, which we denote by $\mathrm{Poly}_d(\mathbb{F}_p^n)$. It is easy to see that the set of monomials $x_1^{i_1}\cdots x_n^{i_n}$, where $0 \le i_1,\ldots,i_n < p$ and $\sum_{j=1}^{n} i_j \le d$, forms a basis for $\mathrm{Poly}_d(\mathbb{F}_p^n)$. So every polynomial $P \in \mathrm{Poly}_d(\mathbb{F}_p^n)$ is of the form $P(x) := \sum c_{i_1,\ldots,i_n} x_1^{i_1}\cdots x_n^{i_n}$, where the sum is over all $0 \le i_1,\ldots,i_n < p$ with $\sum_{j=1}^{n} i_j \le d$, and the $c_{i_1,\ldots,i_n}$ are elements of $\mathbb{F}_p$. The degree of a polynomial $P : \mathbb{F}_p^n \to \mathbb{F}_p$, denoted by $\deg(P)$, is the smallest $d$ such that $P \in \mathrm{Poly}_d(\mathbb{F}_p^n)$. A polynomial $P$ is called homogeneous if all monomials with non-zero coefficients in the expansion of $P$ are of degree exactly $\deg(P)$.

Phase Polynomials: For a function $f : \mathbb{F}_p^n \to \mathbb{C}$ and a direction $y \in \mathbb{F}_p^n$, define the multiplicative derivative of $f$ in the direction of $y$ as $\widetilde{\Delta}_y f(x) = f(x + y)\overline{f(x)}$. Inductively we define $\widetilde{\Delta}_{y_1,\ldots,y_k} f = \widetilde{\Delta}_{y_k}(\widetilde{\Delta}_{y_1,\ldots,y_{k-1}} f)$, for directions $y_1,\ldots,y_k \in \mathbb{F}_p^n$. A function $f : \mathbb{F}_p^n \to \mathbb{C}$ is called a phase polynomial of degree at most $d$ if for every $y_1,\ldots,y_{d+1} \in \mathbb{F}_p^n$, we have $\widetilde{\Delta}_{y_1,\ldots,y_{d+1}} f \equiv 1$. We denote the space of all phase polynomials of degree at most $d$ over $\mathbb{F}_p^n$ by $\mathcal{P}_d(\mathbb{F}_p^n)$. Note that for every $P : \mathbb{F}_p^n \to \mathbb{F}_p$ and every $y \in \mathbb{F}_p^n$, we have that

$$\widetilde{\Delta}_y e_p(P) = e_p(\Delta_y P).$$

This shows that if $P \in \mathrm{Poly}_d(\mathbb{F}_p^n)$, then $e_p(P)$ is a phase polynomial of degree at most $d$. The following simple lemma shows that the converse is essentially true in high characteristic:

Lemma 4.1 (Lemma 1.2 in [16]). Suppose that $0 \le d < p$. Every $f \in \mathcal{P}_d(\mathbb{F}_p^n)$ is of the form $f(x) = e^{2\pi i\theta}\, e_p(P(x))$, for some $\theta \in \mathbb{R}/\mathbb{Z}$ and $P \in \mathrm{Poly}_d(\mathbb{F}_p^n)$.

When $d \ge p$, more complicated phase polynomials arise. Nevertheless, obtaining a complete characterization is possible [14].

Now let us describe the relation between the phase polynomials and the Gowers norms. First note that one can express Gowers uniformity norms using multiplicative derivatives:

$$\|f\|_{U^k}^{2^k} = \mathbb{E}\big[\widetilde{\Delta}_{Y_1,\ldots,Y_k} f(X)\big],$$

where $X, Y_1,\ldots,Y_k$ are independent random variables taking values in $\mathbb{F}_p^n$ uniformly. This for example shows that every phase polynomial $g$ of degree at most $d$ satisfies $\|g\|_{U^{d+1}} = 1$.
Many basic properties of Gowers uniformity norms are implied by the Gowers-Cauchy-Schwarz inequality, which was first proved in [2] by iterated applications of the classical Cauchy-Schwarz inequality.

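Lemma 4.2 (Gowers-Cauchy-Schwarz). Let $G$ be a finite Abelian group, and consider a family of functions $f_S : G \to \mathbb{C}$, where $S \subseteq [k]$. Then

$$\left|\mathbb{E}\Big[\prod_{S\subseteq[k]} \mathcal{C}^{k-|S|} f_S\Big(X + \sum_{i\in S} Y_i\Big)\Big]\right| \le \prod_{S\subseteq[k]} \|f_S\|_{U^k}, \tag{9}$$

where $X, Y_1,\ldots,Y_k$ are independent random variables taking values in $G$ uniformly at random.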

A simple application of Lemma 4.2 is the following. Consider an arbitrary function $f : G \to \mathbb{C}$. Setting $f_\emptyset := f$ and $f_S := 1$ for every $S \neq \emptyset$ in Lemma 4.2, we obtain

$$|\mathbb{E}[f(X)]| \le \|f\|_{U^k}. \tag{10}$$

Equation (10) in particular shows that if $f, g : \mathbb{F}_p^n \to \mathbb{C}$, then one can bound their inner product by the Gowers uniformity norms of $f\overline{g}$:

$$|\langle f, g\rangle| \le \|f\overline{g}\|_{U^k}. \tag{11}$$

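Consider an arbitrary $f : \mathbb{F}_p^n \to \mathbb{C}$ and a phase polynomial $g$ of degree at most $d$. Then for every $y_1,\ldots,y_{d+1} \in \mathbb{F}_p^n$, we have

$$\widetilde{\Delta}_{y_1,\ldots,y_{d+1}}(f\overline{g}) = \big(\widetilde{\Delta}_{y_1,\ldots,y_{d+1}} f\big)\,\overline{\big(\widetilde{\Delta}_{y_1,\ldots,y_{d+1}} g\big)} = \widetilde{\Delta}_{y_1,\ldots,y_{d+1}} f,$$

which in turn implies that $\|f\overline{g}\|_{U^{d+1}} = \|f\|_{U^{d+1}}$. We conclude that

$$\sup_{g\in\mathcal{P}_d} |\langle f, g\rangle| \le \|f\|_{U^{d+1}}. \tag{12}$$

This provides us with a "direct theorem" for the $U^{d+1}$ norm: if $\sup_{g\in\mathcal{P}_d} |\langle f, g\rangle| \ge \varepsilon$, then $\|f\|_{U^{d+1}} \ge \varepsilon$. The following theorem provides the corresponding inverse theorem.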

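Theorem 4.3 ([1, 15, 16]). Let $d$ be a positive integer. There exists a function $\delta : (0,1] \to (0,1]$ such that for every $f : \mathbb{F}_p^n \to \mathbb{D}$ and $\varepsilon > 0$,

• Direct theorem: If $\sup_{g\in\mathcal{P}_d} |\langle f, g\rangle| \ge \varepsilon$, then $\|f\|_{U^{d+1}} \ge \varepsilon$.

• Inverse theorem: If $\|f\|_{U^{d+1}} \ge \varepsilon$, then $\sup_{g\in\mathcal{P}_d} |\langle f, g\rangle| \ge \delta(\varepsilon)$.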

In the case of $1 \le d < p$, Theorem 4.3 was established by Bergelson, Tao, and Ziegler [1, 16]. The case $d \ge p$ was established only very recently by Tao and Ziegler [15]. In the range $1 \le d < p$, Lemma 4.1 shows that the phase polynomials of degree at most $d$ can be described using polynomials of degree at most $d$. So Theorem 4.3 shows that in this case, if $\|f\|_{U^{d+1}} \ge \varepsilon$, then there exists a polynomial $g : \mathbb{F}_p^n \to \mathbb{F}_p$ of degree at most $d$ such that $|\langle f, e_p(g)\rangle| \ge \delta(\varepsilon) > 0$.
4.2 Decomposition theorems

An important application of the inverse theorems is that they imply "decomposition theorems". Roughly speaking, these results say that under appropriate conditions, a function $f$ can be decomposed as $f_1 + f_2$, where $f_1$ is "structured" in some sense that enables one to handle it easily, while $f_2$ is "quasi-random", meaning that it shares certain properties with a random function and can be discarded as random noise. They are discussed in this abstract form in [3]. In the following we will discuss decomposition theorems that follow from Theorem 4.3, but first we need to define polynomial factors on $\mathbb{F}_p^n$.

Definition 4.4 (Polynomial factors). Let $p$ be a fixed prime. Let $P_1,\ldots,P_C \in \mathrm{Poly}_d(\mathbb{F}_p^n)$. The sigma-algebra on $\mathbb{F}_p^n$ whose atoms are $\{x \in \mathbb{F}_p^n : P_1(x) = a(1),\ldots,P_C(x) = a(C)\}$ for all $a \in \mathbb{F}_p^C$ is called a polynomial factor of degree at most $d$ and complexity at most $C$.

Let $\mathcal{B}$ be a polynomial factor defined by $P_1,\ldots,P_C$. For $f : \mathbb{F}_p^n \to \mathbb{C}$, the conditional expectation of $f$ with respect to $\mathcal{B}$, denoted $\mathbb{E}(f|\mathcal{B}) : \mathbb{F}_p^n \to \mathbb{C}$, is

$$\mathbb{E}(f|\mathcal{B})(x) = \mathbb{E}_{\{y\in\mathbb{F}_p^n :\, P_1(y)=P_1(x),\ldots,P_C(y)=P_C(x)\}}[f(y)].$$

That is, $\mathbb{E}(f|\mathcal{B})$ is constant on every atom of $\mathcal{B}$, and this constant is the average value that $f$ attains on this atom. A function $g : \mathbb{F}_p^n \to \mathbb{C}$ is $\mathcal{B}$-measurable if it is constant on every atom of $\mathcal{B}$. Equivalently, we can write $g$ as $g(x) = \Gamma(P_1(x),\ldots,P_C(x))$ for some function $\Gamma : \mathbb{F}_p^C \to \mathbb{C}$. The following claim is quite useful, although its proof is immediate and holds for every sigma-algebra.

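Observation 4.5. Let $f : \mathbb{F}_p^n \to \mathbb{C}$. Let $\mathcal{B}$ be a polynomial factor defined by polynomials $P_1, \dots, P_C$. Let $g : \mathbb{F}_p^n \to \mathbb{C}$ be any $\mathcal{B}$-measurable function. Then

$$\langle f, g \rangle = \langle \mathbb{E}(f|\mathcal{B}), g \rangle.$$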
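The conditional expectation and Observation 4.5 can be checked by brute force on a tiny example. The following is a minimal sketch, not part of the original argument: the choice $p = 3$, $n = 2$, the single polynomial $x(1)x(2)$, the test functions `f` and `T`, and the convention $\langle u, v \rangle = \mathbb{E}[u \overline{v}]$ are all assumptions made for illustration.

```python
import itertools
from collections import defaultdict

def conditional_expectation(f, polys, p, n):
    """Average f over each atom {x : P_1(x) = a(1), ..., P_C(x) = a(C)} of the factor."""
    atoms = defaultdict(list)
    for x in itertools.product(range(p), repeat=n):
        atoms[tuple(P(x) % p for P in polys)].append(x)
    cond = {}
    for members in atoms.values():
        avg = sum(f[x] for x in members) / len(members)
        for x in members:
            cond[x] = avg  # constant on the atom
    return cond

def inner(u, v):
    """Assumed convention: <u, v> = E_x[u(x) * conj(v(x))]."""
    return sum(u[x] * complex(v[x]).conjugate() for x in u) / len(u)

p, n = 3, 2
points = list(itertools.product(range(p), repeat=n))
polys = [lambda x: x[0] * x[1]]                 # hypothetical factor of complexity 1
f = {x: ((x[0] + 2 * x[1]) % p) / p for x in points}   # arbitrary bounded function
T = {0: 1.0, 1: -0.5, 2: 0.25j}                 # arbitrary table on atom labels
g = {x: T[(x[0] * x[1]) % p] for x in points}   # B-measurable by construction

Ef = conditional_expectation(f, polys, p, n)
lhs, rhs = inner(f, g), inner(Ef, g)
```

Here `lhs` and `rhs` agree up to floating-point error, since `g` is constant on every atom over which `f` is averaged.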

The following theorem, which follows in a standard manner from Theorem 4.3, gives a simple decomposition theorem.

Theorem 4.6 (Decomposition Theorem [6]). Let $p$ be a fixed prime, $0 \le d < p$ be an integer, and $\varepsilon > 0$. Given any function $f : \mathbb{F}_p^n \to \mathbb{D}$, there exists a polynomial factor $\mathcal{B}$ of degree at most $d$ and complexity at most $C_{\max}(p, d, \varepsilon)$ together with a decomposition

$$f = f_1 + f_2,$$

where

$$f_1 := \mathbb{E}(f|\mathcal{B}) \quad \text{and} \quad \|f_2\|_{U^{d+1}} \le \varepsilon.$$

We sketch the standard proof of Theorem 4.6 below, as we will need some extensions of it in this paper. For a full proof we refer the reader to [6].

Proof sketch. We create a sequence of polynomial factors $\mathcal{B}_1, \mathcal{B}_2, \dots$ as follows. Let $\mathcal{B}_1$ be the trivial factor (i.e. $\mathbb{E}(f|\mathcal{B}_1)$ is the constant function $\mathbb{E}[f]$). Let $g_i = f - \mathbb{E}(f|\mathcal{B}_i)$. If $\|g_i\|_{U^{d+1}} \le \varepsilon$ we are done. Otherwise, by Theorem 4.3, since $\|g_i\|_\infty \le 2$, there exists a polynomial $P_i \in \mathrm{Poly}_d(\mathbb{F}_p^n)$ such that $|\langle g_i, e_p(P_i) \rangle| \ge \delta(\varepsilon)$. Let $\mathcal{B}_{i+1} = \mathcal{B}_i \cup \{P_i\}$. The key point is that one can show that

$$\|g_{i+1}\|_2^2 \le \|g_i - \langle g_i, e_p(P_i) \rangle e_p(P_i)\|_2^2 \le \|g_i\|_2^2 - \delta(\varepsilon)^2.$$

Thus, since $\|g_1\|_2^2 \le 1$, the process must stop after at most $1/\delta(\varepsilon)^2$ steps. □
Suppose that the factor $\mathcal{B}$ is defined by $P_1, \dots, P_C \in \mathrm{Poly}_d(\mathbb{F}_p^n)$. Assume that $f_1(x) = \Gamma(P_1(x), \dots, P_C(x))$. Using the Fourier decomposition of $\Gamma$, we can express $f_1$ as

$$f_1(x) = \sum_{\gamma \in \mathbb{F}_p^C} \hat{\Gamma}(\gamma)\, e_p\Big(\sum_{i=1}^{C} \gamma(i) P_i(x)\Big). \qquad (13)$$

Note that for every $\gamma \in \mathbb{F}_p^C$, $\sum_{i=1}^{C} \gamma(i) P_i(x) \in \mathrm{Poly}_d(\mathbb{F}_p^n)$. So (13) gives an expansion for $f_1$ which is similar to the Fourier expansion, but instead of characters $e_p(\sum_i \alpha(i) x_i)$, we have exponential functions $e_p(\sum_{i=1}^{C} \gamma(i) P_i(x))$ which have polynomials of degree $d$ in the powers instead of linear functions. For this decomposition to be as useful as the Fourier expansion, one needs some kind of orthogonality for the functions appearing in the expansion.
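The expansion (13) can be verified numerically on a small instance. The sketch below is illustrative only: the parameters $p = 3$, $n = 3$, the two polynomials, and the table `Gamma` are hypothetical choices, and the normalization $\hat{\Gamma}(\gamma) = \mathbb{E}_z[\Gamma(z) e_p(-\sum_i \gamma(i) z(i))]$ is assumed.

```python
import cmath
import itertools

p, n, C = 3, 3, 2

def e_p(t):
    """The additive character e_p(t) = exp(2*pi*i*t/p)."""
    return cmath.exp(2j * cmath.pi * (t % p) / p)

# Hypothetical polynomials defining the factor (degree at most 2).
polys = [lambda x: x[0] * x[1], lambda x: x[0] + x[2] ** 2]

labels = list(itertools.product(range(p), repeat=C))
# Arbitrary function Gamma on F_p^C with values in the unit disk.
Gamma = {z: ((z[0] + 2 * z[1] * z[1]) % p) / p for z in labels}

# Fourier coefficients of Gamma (assumed normalization: average over F_p^C).
Gamma_hat = {
    g: sum(Gamma[z] * e_p(-sum(a * b for a, b in zip(g, z))) for z in labels) / len(labels)
    for g in labels
}

def f1(x):
    """f_1(x) = Gamma(P_1(x), ..., P_C(x))."""
    return Gamma[tuple(P(x) % p for P in polys)]

def f1_expansion(x):
    """Right-hand side of (13)."""
    return sum(
        Gamma_hat[g] * e_p(sum(gi * P(x) for gi, P in zip(g, polys)))
        for g in labels
    )

max_err = max(abs(f1(x) - f1_expansion(x))
              for x in itertools.product(range(p), repeat=n))
```

The value `max_err` is at the level of floating-point rounding, which is just Fourier inversion on $\mathbb{F}_p^C$ applied at the point $(P_1(x), \dots, P_C(x))$.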

Definition 4.7 (Bias). The bias of a polynomial $P \in \mathrm{Poly}_d(\mathbb{F}_p^n)$ is defined as

$$\mathrm{bias}(P) := \mathrm{bias}(e_p(P)) = \big|\mathbb{E}_{X \in \mathbb{F}_p^n}[e_p(P(X))]\big|.$$
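For intuition, the bias can be computed by brute force for small $p$ and $n$. This is a sketch under assumed parameters ($p = 3$ and three hypothetical polynomials); it is exponential in $n$ and only meant to illustrate the definition.

```python
import cmath
import itertools

def e_p(t, p):
    """The additive character e_p(t) = exp(2*pi*i*t/p)."""
    return cmath.exp(2j * cmath.pi * (t % p) / p)

def bias(P, p, n):
    """|E_X[e_p(P(X))]| over uniform X in F_p^n, by exhaustive enumeration."""
    total = sum(e_p(P(x), p) for x in itertools.product(range(p), repeat=n))
    return abs(total) / p ** n

p = 3
bias_linear = bias(lambda x: x[0], p, 2)                          # nonzero linear form
bias_rank1 = bias(lambda x: x[0] * x[1], p, 2)                    # one product
bias_rank2 = bias(lambda x: x[0] * x[1] + x[2] * x[3], p, 4)      # two independent products
```

In this toy example a nonzero linear form has bias 0, the quadratic $x(1)x(2)$ has bias $1/p$, and adding a second independent product lowers the bias to $1/p^2$, in line with the idea that higher rank forces smaller bias.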

We shall refine the set of polynomials $\{P_1, \dots, P_C\}$ to obtain a new set of polynomials with the desired "approximate orthogonality" properties. This will be achieved through the notion of the rank of a set of polynomials.

Definition 4.8 (Rank). We say a set of polynomials $\mathcal{P} = \{P_1, \dots, P_t\}$ is of rank greater than $r$, and denote this by $\mathrm{rank}(\mathcal{P}) > r$, if the following holds. For any non-zero $\alpha = (\alpha_1, \dots, \alpha_t) \in \mathbb{F}_p^t$, define $P_\alpha(x) := \sum_{j=1}^{t} \alpha_j P_j(x)$. For $d := \max\{\deg(P_j) : \alpha_j \ne 0\}$, the polynomial $P_\alpha$ cannot be expressed as a function of $r$ polynomials of degree at most $d - 1$. More precisely, it is not possible to find $r$ polynomials $Q_1, \dots, Q_r$ of degree at most $d - 1$, and a function $\Gamma : \mathbb{F}_p^r \to \mathbb{F}_p$ such that

$$P_\alpha(x) = \Gamma(Q_1(x), \dots, Q_r(x)).$$

The rank of a single polynomial $P$ is defined to be $\mathrm{rank}(\{P\})$.

The rank of a polynomial factor is the rank of the set of polynomials defining it. The following lemma follows from the definition of the rank. For a proof see [7].

Lemma 4.9 (Making factors high-rank). Let $r : \mathbb{N} \to \mathbb{N}$ be an arbitrary growth function. Then there is another function $\tau : \mathbb{N} \to \mathbb{N}$ with the following property. Let $\mathcal{B}$ be a polynomial factor with complexity at most $C$. Then there is a refinement $\mathcal{B}'$ of $\mathcal{B}$ with complexity at most $C' \le \tau(C)$ and rank at least $r(C')$.

The following theorem, due to Kaufman and Lovett [11], connects the notion of the rank to the bias of a polynomial. It was proved first by Green and Tao [7] for the case $d < p$, and then extended by Kaufman and Lovett [11] to the general case.

Theorem 4.10 (Regularity [11]). Fix $p$ prime and $d \ge 1$. There exists a function $r_{p,d} : (0, 1] \to \mathbb{N}$ such that the following holds. If $P : \mathbb{F}_p^n \to \mathbb{F}_p$ is a polynomial of degree at most $d$ with $\mathrm{bias}(P) \ge \varepsilon$, then $\mathrm{rank}(P) \le r_{p,d}(\varepsilon)$.

Combining Lemma 4.9 with Theorem 4.6, it is possible to obtain a strong decomposition theorem.
Theorem 4.11 (Strong Decomposition Theorem [6]). Let $p$ be a fixed prime, $0 \le d < p$ be an integer, $\delta > 0$, and let $r : \mathbb{N} \to \mathbb{N}$ be an arbitrary growth function. Given any function $f : \mathbb{F}_p^n \to \mathbb{D}$, there exists a decomposition

$$f = f_1 + f_2,$$

such that

$$f_1 := \mathbb{E}(f|\mathcal{B}), \qquad \|f_2\|_{U^{d+1}} \le \delta,$$

where $\mathcal{B}$ is a polynomial factor of degree at most $d$, complexity $C \le C_{\max}(p, d, \delta, r(\cdot))$, and rank at least $r(C)$.

We sketch the proof below. For a full proof we refer the reader to [6]. 

Proof sketch. The proof follows the same steps as the proof of Theorem 4.6, except that at each step we regularize the polynomial factor $\mathcal{B}_i$ to obtain $\mathcal{B}'_i$, and set $\mathcal{B}_{i+1} = \mathcal{B}'_i \cup \{P_i\}$. The only new insight is that, as $\mathcal{B}'_i$ is a refinement of $\mathcal{B}_i$, we have

$$\|f - \mathbb{E}(f|\mathcal{B}'_i)\|_2 \le \|f - \mathbb{E}(f|\mathcal{B}_i)\|_2.$$

□

Note that Theorem 4.11 guarantees a strong approximate orthogonality. For every fixed function $\omega : \mathbb{N} \to \mathbb{N}$, by taking $r(\cdot)$ to be a sufficiently fast growing function, one can guarantee that the polynomials $P_1, \dots, P_C$ that define the factor $\mathcal{B}$ have the property

$$\Big|\mathbb{E}\Big[e_p\Big(\sum_{i=1}^{C} \gamma(i) P_i(X)\Big)\Big]\Big| \le \frac{1}{\omega(C)}, \qquad (14)$$

for all nonzero $\gamma \in \mathbb{F}_p^C$. That is, the polynomials can be made "nearly orthogonal" to any required precision.

The decomposition theorems stated so far referred to a single function. In this paper we require decomposition theorems which relate several functions to a single polynomial factor. The proofs can be adapted in a straightforward manner to prove the next result.

Lemma 4.12 (Strong Decomposition Theorem - multiple functions). Let $p$ be a fixed prime, $0 \le d < p$ and $m$ be integers, let $\delta > 0$, and let $r : \mathbb{N} \to \mathbb{N}$ be an arbitrary growth function. Given any set of functions $f_1, \dots, f_m : \mathbb{F}_p^n \to \mathbb{D}$, there exists a decomposition of each $f_i$ as

$$f_i = h_i + h'_i,$$

such that

$$h_i := \mathbb{E}(f_i|\mathcal{B}), \qquad \|h'_i\|_{U^{d+1}} \le \delta,$$

where $\mathcal{B}$ is a polynomial factor of degree at most $d$, complexity $C \le C_{\max}(p, d, \delta, m, r(\cdot))$ and rank at least $r(C)$. Furthermore we can assume that $\mathcal{B}$ is defined by homogeneous polynomials.
5 Strong orthogonality

Let $\mathcal{B}$ be a polynomial factor defined by polynomials $P_1, \dots, P_C$, and let $f : \mathbb{F}_p^n \to \mathbb{D}$ be a $\mathcal{B}$-measurable function. We saw in (13) that $f$ can be expressed as a linear combination of $e_p(\sum_{i=1}^{C} \gamma(i) P_i(x))$, for $\gamma \in \mathbb{F}_p^C$. Furthermore, if we require the polynomials to be of high rank, then we obtain an approximate orthogonality as in (14). This approximate orthogonality is sufficient for analyzing averages such as $\mathbb{E}[f_1(X) f_2(X) \cdots f_m(X)]$, where $f_1, \dots, f_m$ (not necessarily distinct) are all measurable with respect to $\mathcal{B}$. However, it is not a priori clear how this orthogonality can be used to deal with averages of the form $\mathbb{E}[f_1(L_1(X)) \cdots f_m(L_m(X))]$, for linear forms $L_1, \dots, L_m$. The difficulty arises when one has to understand exponential averages such as $\mathbb{E}[e_p(P(X + Y) - P(X) - P(Y))]$. This average is 1 when $P$ is a homogeneous polynomial of degree one, but it does not immediately follow from what we have said so far that it is small when $P$ is of higher degree. In this section we develop the results needed to deal with such exponential averages.

Consider a set of homogeneous polynomials $\{P_1, \dots, P_k\}$, and a set of linear forms $\{L_1, \dots, L_m\}$. We need to be able to analyze exponential averages of the form

$$\mathbb{E}_X\Big[e_p\Big(\sum_{i=1}^{k} \sum_{j=1}^{m} \lambda_{i,j} P_i(L_j(X))\Big)\Big],$$

where $\lambda_{i,j} \in \mathbb{F}_p$. Lemma 5.1 below shows that if $\{P_1, \dots, P_k\}$ are of sufficiently high rank, then either $\sum_{i=1}^{k} \sum_{j=1}^{m} \lambda_{i,j} P_i(L_j(x)) \equiv 0$, which implies that the corresponding exponential average is exactly 1, or otherwise the exponential average is very small. Note that this is an "approximate" version of the case of characters. Namely, if $\{\chi_{y_1}, \dots, \chi_{y_k}\}$ are characters of $\mathbb{F}_p^n$, then $\mathbb{E}\big[\prod_{i=1}^{k} \prod_{j=1}^{m} \chi_{y_i}(L_j(X))\big]$ is either 1 or 0. In the case of polynomials of high rank, the "zero" case is approximated by a small number.

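Lemma 5.1. Fix $p$ prime and $d < p$. Let $\{L_1, \dots, L_m\}$ be a system of linear forms. Let $\mathcal{P} = \{P_1, \dots, P_k\}$ be a collection of homogeneous polynomials of degree at most $d$, such that $\mathrm{rank}(\mathcal{P}) > r_{p,d}(\varepsilon)$. For every set of coefficients $\Lambda = \{\lambda_{i,j} \in \mathbb{F}_p : i \in [k], j \in [m]\}$, and

$$P_\Lambda(x) := \sum_{i=1}^{k} \sum_{j=1}^{m} \lambda_{i,j} P_i(L_j(x)),$$

one of the following two cases holds:

$$P_\Lambda \equiv 0 \quad \text{or} \quad \mathrm{bias}(P_\Lambda) < \varepsilon.$$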

The proof of Lemma 5.1 is given in Section 5.1. Lemma 5.1 shows that in order to estimate $\mathrm{bias}(P_\Lambda)$ for polynomials of high rank $\mathcal{P} = \{P_1, \dots, P_k\}$, it suffices to determine whether $P_\Lambda$ is identically 0 or not. Our next observation, whose proof is given in Section 5.1, says that when the polynomials are homogeneous and linearly independent, whether $P_\Lambda \equiv 0$ depends only on the set of the coefficients $\lambda_{i,j}$, the linear forms $L_j$, and the degrees of the polynomials $P_1, \dots, P_k$, and not on the particular choice of the polynomials.

Lemma 5.2. Let $\{L_1, \dots, L_m\}$ be a system of linear forms over $\mathbb{F}_p$, $\lambda_{i,j} \in \mathbb{F}_p$ for $i \in [k], j \in [m]$, and $d_1, \dots, d_k \in [d]$. Then one of the following two cases holds:

(i) For every collection of linearly independent homogeneous polynomials $P_1, \dots, P_k$ of degrees $d_1, \dots, d_k$:

$$\sum_{i=1}^{k} \sum_{j=1}^{m} \lambda_{i,j} P_i(L_j(x)) \equiv 0.$$

(ii) For every collection of linearly independent homogeneous polynomials $P_1, \dots, P_k$ of degrees $d_1, \dots, d_k$:

$$\sum_{i=1}^{k} \sum_{j=1}^{m} \lambda_{i,j} P_i(L_j(x)) \not\equiv 0.$$

Sometimes one needs to deal with non-homogeneous polynomials. This is the case, for example, in our subsequent paper [10] about correlation testing for affine invariant properties on $\mathbb{F}_p^n$. The following lemma shows that if the system of linear forms is homogeneous, then the case of non-homogeneous polynomials reduces to that of homogeneous polynomials. Let $P : \mathbb{F}_p^n \to \mathbb{F}_p$ be a polynomial of degree at most $d$. For $1 \le l \le d$, let $P^{(l)}$ denote the homogeneous polynomial that is obtained from $P$ by removing all the monomials whose degrees are not equal to $l$.

Lemma 5.3. Let $\mathcal{L} = \{L_1, \dots, L_m\}$ be a homogeneous system of linear forms. Furthermore, let $P_1, \dots, P_k$ be polynomials of degrees $d_1, \dots, d_k$ such that $\{P_i^{(d_i)} : i \in [k]\}$ are linearly independent over $\mathbb{F}_p$. Then for $\lambda_{i,j} \in \mathbb{F}_p$,

$$\sum_{i=1}^{k} \sum_{j=1}^{m} \lambda_{i,j} P_i(L_j(x)) \equiv 0 \iff \sum_{i=1}^{k} \sum_{j=1}^{m} \lambda_{i,j} P_i^{(d_i)}(L_j(x)) \equiv 0.$$

Using Lemma 5.3 one can derive the following analog of Lemma 5.1 for averages of non-homogeneous polynomials when the system of linear forms is homogeneous.

Lemma 5.4. Fix $p$ prime and $d < p$. Let $\{L_1, \dots, L_m\}$ be a homogeneous system of linear forms. Let $\mathcal{P} = \{P_1, \dots, P_k\}$ be a collection of polynomials of degree at most $d$, such that $\mathrm{rank}(\mathcal{P}) > r_{p,d}(\varepsilon)$. For every set of coefficients $\Lambda = \{\lambda_{i,j} \in \mathbb{F}_p : i \in [k], j \in [m]\}$, and

$$P_\Lambda(x) := \sum_{i=1}^{k} \sum_{j=1}^{m} \lambda_{i,j} P_i(L_j(x)),$$

one of the following two cases holds:

$$P_\Lambda \equiv 0 \quad \text{or} \quad \mathrm{bias}(P_\Lambda) < \varepsilon.$$

Lemmas 5.1, 5.2, 5.3 and 5.4 show that when studying the averages defined by systems of linear forms, under some homogeneity conditions (either for the polynomials or for the system of linear forms), high rank polynomials of the same degree sequence behave in a similar manner. The following proposition captures this.

Proposition 5.5 (An invariance result). Let $\mathcal{P} = \{P_1, \dots, P_k\}$, $\mathcal{Q} = \{Q_1, \dots, Q_k\}$ be two collections of polynomials over $\mathbb{F}_p^n$ of degree at most $d < p$ such that $\deg(P_i) = \deg(Q_i)$ for every $1 \le i \le k$. Let $\mathcal{L} = \{L_1, \dots, L_m\}$ be a system of linear forms, and $\Gamma : \mathbb{F}_p^k \to \mathbb{D}$ be an arbitrary function. Define $f, g : \mathbb{F}_p^n \to \mathbb{D}$ by

$$f(x) = \Gamma(P_1(x), \dots, P_k(x))$$

and

$$g(x) = \Gamma(Q_1(x), \dots, Q_k(x)).$$

Let $r_{p,d} : (0, 1] \to \mathbb{N}$ be as given in Lemma 5.1 and Lemma 5.4. Then for every $\varepsilon > 0$, if $\mathrm{rank}(\mathcal{P}), \mathrm{rank}(\mathcal{Q}) > r_{p,d}(\varepsilon)$, we have

$$|t_{\mathcal{L}}(f) - t_{\mathcal{L}}(g)| \le 2\varepsilon p^{mk},$$

provided that at least one of the following two conditions holds:

(i) The polynomials $P_1, \dots, P_k$ and $Q_1, \dots, Q_k$ are homogeneous.

(ii) The system of linear forms $\mathcal{L}$ is homogeneous.
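Proof. The Fourier expansion of $\Gamma$ shows

$$\Gamma(z(1), \dots, z(k)) = \sum_{\gamma \in \mathbb{F}_p^k} \hat{\Gamma}(\gamma)\, e_p\Big(\sum_{i=1}^{k} \gamma(i) \cdot z(i)\Big).$$

We thus have

$$\prod_{j=1}^{m} f(L_j(x)) = \sum_{\gamma_1, \dots, \gamma_m \in \mathbb{F}_p^k} \hat{\Gamma}(\gamma_1) \cdots \hat{\Gamma}(\gamma_m)\, e_p\Big(\sum_{i \in [k], j \in [m]} \gamma_j(i) \cdot P_i(L_j(x))\Big)$$

and

$$\prod_{j=1}^{m} g(L_j(x)) = \sum_{\gamma_1, \dots, \gamma_m \in \mathbb{F}_p^k} \hat{\Gamma}(\gamma_1) \cdots \hat{\Gamma}(\gamma_m)\, e_p\Big(\sum_{i \in [k], j \in [m]} \gamma_j(i) \cdot Q_i(L_j(x))\Big).$$

By Lemma 5.1 (under condition (i)) or Lemma 5.4 (under condition (ii)) we know that for every $\gamma_1, \dots, \gamma_m \in \mathbb{F}_p^k$ each one of the polynomials $\sum_{i \in [k], j \in [m]} \gamma_j(i) \cdot P_i(L_j(x))$ and $\sum_{i \in [k], j \in [m]} \gamma_j(i) \cdot Q_i(L_j(x))$ is either zero, or has bias at most $\varepsilon$. But Lemmas 5.2 and 5.3 show that under each one of the conditions (i) or (ii), since $\deg(P_i) = \deg(Q_i)$, we have that

$$\sum_{i \in [k], j \in [m]} \gamma_j(i) \cdot P_i(L_j(x)) \equiv 0 \iff \sum_{i \in [k], j \in [m]} \gamma_j(i) \cdot Q_i(L_j(x)) \equiv 0.$$

Hence we have that

$$\Big|\mathbb{E}_X\Big[e_p\Big(\sum_{i \in [k], j \in [m]} \gamma_j(i) \cdot P_i(L_j(X))\Big)\Big] - \mathbb{E}_X\Big[e_p\Big(\sum_{i \in [k], j \in [m]} \gamma_j(i) \cdot Q_i(L_j(X))\Big)\Big]\Big| \le 2\varepsilon,$$

and

$$\Big|\mathbb{E}_X\Big[\prod_{j=1}^{m} f(L_j(X))\Big] - \mathbb{E}_X\Big[\prod_{j=1}^{m} g(L_j(X))\Big]\Big| \le 2\varepsilon \cdot \sum_{\gamma_1, \dots, \gamma_m \in \mathbb{F}_p^k} |\hat{\Gamma}(\gamma_1)| \cdots |\hat{\Gamma}(\gamma_m)|.$$

Since $\|\Gamma\|_\infty \le 1$, each Fourier coefficient satisfies $|\hat{\Gamma}(\gamma)| \le 1$, and we conclude

$$\Big|\mathbb{E}_X\Big[\prod_{j=1}^{m} f(L_j(X))\Big] - \mathbb{E}_X\Big[\prod_{j=1}^{m} g(L_j(X))\Big]\Big| \le 2\varepsilon p^{mk}.$$

□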
5.1 Proofs of Lemmas 5.1, 5.2, 5.3 and 5.4

We need to introduce some new notation in this section. We provide several examples to illustrate these concepts.

Let $P(x)$ be a homogeneous polynomial of degree $d < p$. Let $B(x_1, \dots, x_d)$ be the symmetric multi-linear form associated with $P$; that is, $P(x) = B(x, \dots, x)$ and $B(x_1, \dots, x_d) = B(x_{\sigma_1}, \dots, x_{\sigma_d})$ for every permutation $\sigma$ of $[d]$.

Example 5.6. Consider a prime $p \ge 5$ and the polynomial $P(x) := 6x(1)x(2)^2 \in \mathbb{F}_p[x(1), \dots, x(n)]$. Then

$$B(x_1, x_2, x_3) := 2\big(x_1(1)x_2(2)x_3(2) + x_1(2)x_2(1)x_3(2) + x_1(2)x_2(2)x_3(1)\big)$$

is the symmetric multi-linear form associated with $P$: it is linear in each of $x_1, x_2, x_3$, invariant under permuting them, and satisfies $B(x, x, x) = 6x(1)x(2)^2$. ■
If $L(x) = \sum_{i=1}^{k} c_i x_i$ is a linear form in $k$ variables, then we have

$$P(L(x)) = \sum_{i_1, \dots, i_d \in [k]} c_{i_1} \cdots c_{i_d}\, B(x_{i_1}, \dots, x_{i_d}).$$

For $u = (u_1, \dots, u_d) \in [k]^d$, denote $x_u = (x_{u_1}, \dots, x_{u_d})$. Let $U^d \subseteq [k]^d$ be defined as $U^d = \{(u_1, \dots, u_d) \in [k]^d : u_1 \le u_2 \le \dots \le u_d\}$. For $u \in U^d$, denote by $\ell_d(u)$ the number of distinct permutations (see footnote 3) of $(u_1, \dots, u_d)$, and let $c_d(u, L) := c_{u_1} \cdots c_{u_d}$. Since $B$ is symmetric, we have

$$P(L(x)) = \sum_{u \in U^d} \ell_d(u)\, c_d(u, L)\, B(x_u). \qquad (15)$$

Note that $\ell_d(u)$ depends only on $u$, and $c_d(u, L)$ depends only on the linear form $L$ and $u \in [k]^d$.
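The counting in (15) can be sanity-checked numerically for $d = 3$ and a linear form $x_1 + jx_2$ in two variables. The sketch below is an illustration only: the prime $p = 5$, the dimension $n = 2$ and the particular symmetric tensor `coeff` are hypothetical choices, and `ell` implements the multinomial count of distinct permutations.

```python
import itertools
from collections import Counter
from math import factorial

p, n = 5, 2

# A symmetric 3-tensor over F_p: the entry depends only on the sorted index triple.
coeff = {idx: (1 + idx[0] + 2 * idx[1] + 3 * idx[2]) % p
         for idx in itertools.combinations_with_replacement(range(n), 3)}

def B(x, y, z):
    """Symmetric tri-linear form built from `coeff`."""
    total = 0
    for a, b, c in itertools.product(range(n), repeat=3):
        total += coeff[tuple(sorted((a, b, c)))] * x[a] * y[b] * z[c]
    return total % p

def P(x):
    """Homogeneous cubic associated with B, i.e. P(x) = B(x, x, x)."""
    return B(x, x, x)

def ell(u):
    """Number of distinct permutations of the tuple u."""
    count = factorial(len(u))
    for mult in Counter(u).values():
        count //= factorial(mult)
    return count

def expansion_holds():
    """Check P(x1 + j*x2) against the right-hand side of (15) for all x1, x2, j."""
    vectors = list(itertools.product(range(p), repeat=n))
    for x1, x2 in itertools.product(vectors, repeat=2):
        for j in range(p):
            lhs = P(tuple((a + j * b) % p for a, b in zip(x1, x2)))
            rhs = (ell((1, 1, 1)) * B(x1, x1, x1)
                   + ell((1, 1, 2)) * j * B(x1, x1, x2)
                   + ell((1, 2, 2)) * j * j * B(x1, x2, x2)
                   + ell((2, 2, 2)) * j ** 3 * B(x2, x2, x2)) % p
            if lhs != rhs:
                return False
    return True

ok = expansion_holds()
```

The four tuples $(1,1,1), (1,1,2), (1,2,2), (2,2,2)$ are exactly the elements of $U^3$ for $k = 2$, and the factors $1, j, j^2, j^3$ are the corresponding values of $c_3(u, L)$.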

Example 5.7. Suppose that $p \ge 5$ is a prime and $P(x)$ is a homogeneous polynomial of degree 3. Let $B(x_1, x_2, x_3)$ be the symmetric multi-linear form associated with $P$. Let $L(x) = x_1 + jx_2$ be a linear form in two variables where $j \in \mathbb{F}_p$ is a constant. In this case (15) becomes

$$P(x_1 + jx_2) = B(x_1 + jx_2, x_1 + jx_2, x_1 + jx_2)$$
$$= B(x_1, x_1, x_1) + jB(x_1, x_1, x_2) + \dots + j^3 B(x_2, x_2, x_2)$$
$$= B(x_1, x_1, x_1) + 3jB(x_1, x_1, x_2) + 3j^2 B(x_1, x_2, x_2) + j^3 B(x_2, x_2, x_2).$$

Note that, for example, the coefficient of $B(x_1, x_2, x_2)$ is $3j^2$, which is consistent with (15) as

$$\ell_3(1, 2, 2) = \frac{3!}{1!\,2!} = 3 \quad \text{and} \quad c_3((1, 2, 2), L) = 1 \times j \times j = j^2.$$

■

We need the following claim.



Claim 5.8. Let $B$ be a homogeneous multi-linear form over $\mathbb{F}_p$ of degree $d < p$. Consider a linear combination

$$Q(x) = \sum_{u \in U^d} c_u B(x_u),$$

where not all the coefficients $c_u$ are zero. Then there exist $a_1, \dots, a_k \in \mathbb{F}_p$ and $\alpha \in \mathbb{F}_p \setminus \{0\}$ such that for every $w \in \mathbb{F}_p^n$

$$Q(a_1 w, \dots, a_k w) = \alpha B(w, \dots, w).$$

Footnote 3. If the multiplicities of the elements of a multi-set are $i_1, \dots, i_\ell$, then the number of distinct permutations of those elements is $\frac{(i_1 + \dots + i_\ell)!}{i_1! \cdots i_\ell!}$.
Proof. Consider $x = (a_1 w, \dots, a_k w)$. As $B$ is multi-linear, we have

$$B(x_u) = \Big(\prod_{i=1}^{d} a_{u_i}\Big) B(w, \dots, w),$$

for every $u \in U^d$. Let $a = (a_1, \dots, a_k)$ and let $a^u$ denote the monomial $a^u = \prod_{i=1}^{d} a_{u_i}$. We thus have

$$Q(a_1 w, \dots, a_k w) = \Big(\sum_{u \in U^d} c_u a^u\Big) \cdot B(w, \dots, w).$$

Consider $g(a_1, \dots, a_k) = \sum_{u \in U^d} c_u a^u$. This is a polynomial in $a_1, \dots, a_k$ which is not identically zero, as distinct $u \in U^d$ correspond to distinct monomials $a^u$. Since its degree is $d < p$, it does not vanish on all of $\mathbb{F}_p^k$. Hence there exists some assignment for $a$ for which $\alpha := g(a) \ne 0$. □

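Consider a set of linearly independent homogeneous polynomials $\{P_1, \dots, P_k\}$, a system of linear forms $\{L_1, \dots, L_m\}$ and some coefficients $\lambda_{i,j} \in \mathbb{F}_p$ where $i \in [k], j \in [m]$. Let $B_i$ be the symmetric multi-linear form associated with $P_i$. Denoting $d_i := \deg(P_i)$ and using the notation of (15), we thus have

$$P_\Lambda(x) = \sum_{i=1}^{k} \sum_{j=1}^{m} \lambda_{i,j} P_i(L_j(x)) = \sum_{i=1}^{k} \sum_{j=1}^{m} \lambda_{i,j} \sum_{u \in U^{d_i}} \ell_{d_i}(u)\, c_{d_i}(u, L_j)\, B_i(x_u) = \sum_{i=1}^{k} \sum_{u \in U^{d_i}} b_i^{d_i}(u)\, B_i(x_u), \qquad (16)$$

where

$$b_i^{d_i}(u) := \ell_{d_i}(u) \sum_{j=1}^{m} \lambda_{i,j}\, c_{d_i}(u, L_j).$$

Note that the coefficients $b_i^{d_i}(u)$ do not depend on the specific set of polynomials $P_1, \dots, P_k$.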

Example 5.9. Consider homogeneous polynomials $P_1(x)$ of degree 1 and $P_2(x)$ of degree 3, and linear forms $L_j = x_1 + jx_2$ for $j = 1, 2, 3, 4$. Let $B_1(x_1)$ and $B_2(x_1, x_2, x_3)$ be the symmetric multi-linear forms associated with $P_1$ and $P_2$ respectively. We have

$$P_1(x_1 + jx_2) = B_1(x_1) + jB_1(x_2),$$

and as we saw in Example 5.7,

$$P_2(x_1 + jx_2) = B_2(x_1, x_1, x_1) + 3jB_2(x_1, x_1, x_2) + 3j^2 B_2(x_1, x_2, x_2) + j^3 B_2(x_2, x_2, x_2).$$

Now consider some coefficients $\lambda_{i,j} \in \mathbb{F}_p$ for $i \in [2]$ and $j \in [4]$. Then

$$P_\Lambda(x) = \sum_{i=1}^{2} \sum_{j=1}^{4} \lambda_{i,j} P_i(L_j(x)) = \sum_{j=1}^{4} \lambda_{1,j}\big(B_1(x_1) + jB_1(x_2)\big) + \sum_{j=1}^{4} \lambda_{2,j}\big(B_2(x_1, x_1, x_1) + 3jB_2(x_1, x_1, x_2) + 3j^2 B_2(x_1, x_2, x_2) + j^3 B_2(x_2, x_2, x_2)\big).$$

Let us investigate the coefficient of the particular term $B_2(x_1, x_2, x_2)$ in this expression. This coefficient is equal to

$$\sum_{j=1}^{4} \lambda_{2,j}\, 3j^2 = 3 \sum_{j=1}^{4} \lambda_{2,j}\, j^2 = \ell_3(1, 2, 2) \sum_{j=1}^{4} \lambda_{2,j}\, c_3((1, 2, 2), L_j) = b_2^{3}(1, 2, 2),$$

as in (16). ■

Among Lemmas 5.1, 5.2, 5.3 and 5.4, we first prove Lemma 5.2, which has the simplest proof.

Proof of Lemma 5.2. We will show that $P_\Lambda(x) \equiv 0$ if and only if $b_i^{d_i}(u) = 0$ for all $i \in [k]$ and $u \in U^{d_i}$. Let $B_{i_0}$ be a form appearing in $P_\Lambda$ with a nonzero coefficient, that is, $b_{i_0}^{d_{i_0}}(u) \ne 0$ for some $u \in U^{d_{i_0}}$ in (16). Consider any assignment of the form $x = (a_1 w, \dots, a_k w)$. We have that

$$B_i(x_u) = a^u B_i(w, \dots, w) = a^u P_i(w).$$

Hence we get that

$$P_\Lambda(a_1 w, \dots, a_k w) = \sum_{i=1}^{k} \alpha_i P_i(w), \qquad (17)$$

where $\alpha_i = \sum_{u \in U^{d_i}} a^u\, b_i^{d_i}(u)$. Applying Claim 5.8, there exists a choice of $a_1, \dots, a_k$ such that $\alpha_{i_0} \ne 0$, and then the linear independence of $\{P_1, \dots, P_k\}$ shows that $P_\Lambda \not\equiv 0$. □

The proof of Lemma 5.2 shows that $P_\Lambda(x) \not\equiv 0$ if and only if $b_i^{d_i}(u) \ne 0$ for some $i$ and $u$. Next we prove Lemma 5.1, where we show that in this case, under the stronger condition of high rank, $\mathrm{bias}(P_\Lambda)$ is small.

Proof of Lemma 5.1. Suppose that $P_\Lambda(x) \not\equiv 0$, so that $b_i^{d_i}(u) \ne 0$ for some $i \in [k]$ and $u \in U^{d_i}$. Let $d_{i_0} \le d$ be the largest degree such that $b_{i_0}^{d_{i_0}}(u) \ne 0$ for some $u \in U^{d_{i_0}}$.

Assume for contradiction that $\mathrm{bias}(P_\Lambda) \ge \varepsilon$. By the regularity theorem for polynomials (Theorem 4.10) we get that $P_\Lambda(x)$ can be expressed as a function of $r \le r_{p,d_{i_0}}(\varepsilon) \le r_{p,d}(\varepsilon)$ polynomials of degree at most $d_{i_0} - 1$. We will show that this implies $\mathrm{rank}(\mathcal{P}) \le r$. We know that $P_\Lambda(x)$ can be expressed as a function of at most $r$ polynomials of degree at most $d_{i_0} - 1$. This continues to hold under any assignment $x = (a_1 w, \dots, a_k w)$. That is,

$$P_\Lambda(a_1 w, \dots, a_k w) = \sum_{i=1}^{k} \alpha_i P_i(w)$$

is a function of at most $r$ polynomials of degree at most $d_{i_0} - 1$, where $\alpha_i = \sum_{u \in U^{d_i}} a^u\, b_i^{d_i}(u)$. By Claim 5.8 there exists a choice of $a_1, \dots, a_k$ such that $\alpha_{i_0} \ne 0$, and this shows that $\mathrm{rank}(\mathcal{P}) \le r$. □

Next we prove Lemma 5.3, where we deal with the case that the polynomials are not necessarily homogeneous, but instead the system of linear forms $\{L_1, \dots, L_m\}$ is homogeneous.

Proof of Lemma 5.3. It follows from Observation 2.3 that by a change of variables we can assume that $L_1, \dots, L_m$ are linear forms over $\mathbb{F}_p^n$ in $s$ variables $x_1, \dots, x_s$, and $x_1$ appears with coefficient 1 in all linear forms. For $1 \le i \le k$, let $B_i^{1}, \dots, B_i^{d_i}$ be the symmetric multi-linear forms associated with $P_i^{(1)}, \dots, P_i^{(d_i)}$, respectively. Then (16) must be replaced by

$$P_\Lambda(x) = \sum_{i=1}^{k} \sum_{j=1}^{m} \lambda_{i,j} P_i(L_j(x)) = \sum_{i=1}^{k} \sum_{j=1}^{m} \sum_{l=1}^{d_i} \sum_{u \in U^{l}} \lambda_{i,j}\, \ell_l(u)\, c_l(u, L_j)\, B_i^{l}(x_u) = \sum_{i=1}^{k} \sum_{l=1}^{d_i} \sum_{u \in U^{l}} b_i^{l}(u)\, B_i^{l}(x_u),$$

where

$$b_i^{l}(u) := \ell_l(u) \sum_{j=1}^{m} \lambda_{i,j}\, c_l(u, L_j).$$

Suppose that $P_\Lambda(x) \not\equiv 0$. We claim that in this case there exist $i \in [k]$ and $u \in U^{d_i}$ such that $b_i^{d_i}(u) \ne 0$, and this establishes the lemma. In order to prove this claim it suffices to show that if $b_i^{d_i}(u) = 0$ for every $u \in U^{d_i}$, then $b_i^{t}(u) = 0$ for every $0 \le t \le d_i$ and $u \in U^{t}$. Otherwise, let $t$ be the largest integer such that $b_i^{t}(u) = \ell_t(u) \sum_{j=1}^{m} \lambda_{i,j}\, c_t(u, L_j) \ne 0$ for some $u = (u_1, \dots, u_t) \in U^{t}$. Consider $u' = (1, u_1, \dots, u_t) \in U^{t+1}$. Since $x_1$ appears with coefficient 1 in every $L_j$, we have that $c_{t+1}(u', L_j) = c_t(u, L_j)$. Also note that since $d_i < p$ and $p$ is a prime, $\ell_l(u) \ne 0$ for every $1 \le l \le d_i$ and $u \in U^{l}$. Hence we conclude that

$$b_i^{t+1}(u') = \ell_{t+1}(u') \sum_{j=1}^{m} \lambda_{i,j}\, c_t(u, L_j) = \frac{\ell_{t+1}(u')}{\ell_t(u)}\, b_i^{t}(u) \ne 0,$$

which contradicts the maximality of $t$. □

We conclude with the proof of Lemma 5.4.

Proof of Lemma 5.4. The proof is nearly identical to the proof of Lemma 5.1. Let $P_1, \dots, P_k$ be polynomials of degrees $d_1, \dots, d_k$. Decompose each polynomial into homogeneous parts $P_i^{(l)}$ for $1 \le l \le d_i$, and let $B_i^{l}$ denote the corresponding multi-linear form. As in the proof of Lemma 5.1, let $l$ be maximal such that $b_i^{l} \ne 0$ for some $i \in [k]$. As the system of linear forms is homogeneous, we have by Lemma 5.3 that also $b_i^{d_i} \ne 0$, hence $l = d_i$. The proof now continues exactly as in Lemma 5.1. □

6 Proof of the main theorem 

In this section we prove Theorem 3.9. For the convenience of the reader, we restate the theorem.

Theorem 3.9 (restated). Let $\mathcal{L} = \{L_1, \dots, L_m\}$ be a system of linear forms of Cauchy-Schwarz complexity at most $p$. Let $d \ge 0$, and assume that $L_1^{d+1}$ is not in the linear span of $L_2^{d+1}, \dots, L_m^{d+1}$. Then for every $\varepsilon > 0$, there exists $\delta > 0$ such that for any functions $f_1, \dots, f_m : \mathbb{F}_p^n \to \mathbb{D}$ with $\|f_1\|_{U^{d+1}} \le \delta$, we have

$$\Big|\mathbb{E}_{X \in (\mathbb{F}_p^n)^k}\Big[\prod_{i=1}^{m} f_i(L_i(X))\Big]\Big| \le \varepsilon,$$

where $X \in (\mathbb{F}_p^n)^k$ is uniform.
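The span condition in the theorem is a finite linear-algebra check. The sketch below assumes that $L^{t}$ denotes the $t$-th tensor power of the coefficient vector of $L$ (the definition lies outside this section, so this reading is an assumption), and tests whether $L_1^{t}$ lies in the span of $L_2^{t}, \dots, L_m^{t}$ by comparing ranks over $\mathbb{F}_p$. The function names and the example of 4-term arithmetic progressions over $\mathbb{F}_5$ are illustrative choices.

```python
import itertools

def tensor_power(L, t, p):
    """Flattened t-th tensor power of the coefficient vector L, reduced mod p."""
    vec = []
    for idx in itertools.product(range(len(L)), repeat=t):
        prod = 1
        for i in idx:
            prod = (prod * L[i]) % p
        vec.append(prod)
    return vec

def rank_mod_p(rows, p):
    """Rank of a matrix over F_p via Gauss-Jordan elimination (p prime)."""
    rows = [r[:] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)          # modular inverse (Python 3.8+)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col] % p:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def first_power_in_span(forms, t, p):
    """True if L_1^t lies in the F_p-span of L_2^t, ..., L_m^t."""
    powers = [tensor_power(L, t, p) for L in forms]
    return rank_mod_p(powers[1:], p) == rank_mod_p(powers, p)

# 4-term arithmetic progressions x, x+y, x+2y, x+3y over F_5.
ap4 = [(1, 0), (1, 1), (1, 2), (1, 3)]
in_span_squares = first_power_in_span(ap4, 2, 5)
in_span_cubes = first_power_in_span(ap4, 3, 5)
```

For this system the squares are linearly dependent in the required way while the cubes are not, so under the stated reading the hypothesis of the theorem fails for $d = 1$ and holds for $d = 2$.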
If $d \ge s$, then Theorem 3.9 follows from Lemma 3.2. Thus we assume $d < s$, hence also $d < p$. We will assume throughout the proof that $p, m, s, d, \varepsilon$ are constants, and we will not explicitly state dependencies on them.

Let $r : \mathbb{N} \to \mathbb{N}$ be a growth function to be specified later. Let $\eta > 0$ be a sufficiently small constant which will be specified later. By Lemma 4.12 there exists a polynomial factor $\mathcal{B}$ of degree $s$, complexity $C \le C_{\max}(\eta, r(\cdot))$ and rank at least $r(C)$ such that we can decompose each function $f_i$ as

$$f_i = h_i + h'_i,$$

where $h_i = \mathbb{E}(f_i|\mathcal{B})$ and $\|h'_i\|_{U^{s+1}} \le \eta$. Trivially $\|h_i\|_\infty \le 1$ and $\|h'_i\|_\infty \le 2$. We first show that in order to bound $\mathbb{E}[\prod_{i=1}^{m} f_i(L_i(X))]$ it suffices to bound $\mathbb{E}[\prod_{i=1}^{m} h_i(L_i(X))]$ if $\eta$ is chosen to be small enough.

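Claim 6.1. If $\eta \le \frac{\varepsilon}{2m}$ then

$$\Big|\mathbb{E}\Big[\prod_{i=1}^{m} f_i(L_i(X))\Big] - \mathbb{E}\Big[\prod_{i=1}^{m} h_i(L_i(X))\Big]\Big| \le \varepsilon/2,$$

where the averages are over uniform $X \in (\mathbb{F}_p^n)^k$.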
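Proof. We have

$$\prod_{i=1}^{m} f_i(L_i(x)) - \prod_{i=1}^{m} h_i(L_i(x)) = \sum_{i=1}^{m} \Big(\prod_{j=1}^{i-1} h_j(L_j(x))\Big) \cdot (f_i - h_i)(L_i(x)) \cdot \Big(\prod_{j=i+1}^{m} f_j(L_j(x))\Big).$$

Fix $i \in [m]$. Since the Cauchy-Schwarz complexity of $\{L_1, \dots, L_m\}$ is $s$, we have by Lemma 3.2 that

$$\Big|\mathbb{E}\Big[\prod_{j=1}^{i-1} h_j(L_j(X)) \cdot (f_i - h_i)(L_i(X)) \cdot \prod_{j=i+1}^{m} f_j(L_j(X))\Big]\Big| \le \|f_i - h_i\|_{U^{s+1}} \le \eta.$$

Summing over the $m$ values of $i$ gives a total error of at most $m\eta \le \varepsilon/2$. □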
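The telescoping identity used in the proof of Claim 6.1 is purely algebraic and can be checked on arbitrary values. The following sketch uses random complex numbers in the unit disk as stand-ins for the values $f_i(L_i(x))$ and $h_i(L_i(x))$ at a fixed point $x$; the seed and the value $m = 5$ are arbitrary.

```python
import cmath
import math
import random

def telescoping_terms(f_vals, h_vals):
    """Terms (prod_{j<i} h_j) * (f_i - h_i) * (prod_{j>i} f_j) for i = 1..m."""
    m = len(f_vals)
    terms = []
    for i in range(m):
        term = f_vals[i] - h_vals[i]
        for j in range(i):
            term *= h_vals[j]
        for j in range(i + 1, m):
            term *= f_vals[j]
        terms.append(term)
    return terms

def random_disk_point(rng):
    """A random complex number of modulus at most 1."""
    return rng.random() * cmath.exp(2j * cmath.pi * rng.random())

rng = random.Random(0)
m = 5
f_vals = [random_disk_point(rng) for _ in range(m)]
h_vals = [random_disk_point(rng) for _ in range(m)]

terms = telescoping_terms(f_vals, h_vals)
difference = math.prod(f_vals) - math.prod(h_vals)
gap = abs(difference - sum(terms))
```

The quantity `gap` vanishes up to rounding. In the proof, each of the $m$ terms is then bounded separately, which is where the factor $m$ in the choice of $\eta$ comes from.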



We thus set $\eta = \frac{\varepsilon}{2m}$ and regard $\eta$ from now on as a constant, and we do not explicitly specify dependencies on $\eta$ either.

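Let $\{P_i\}_{1 \le i \le C}$ be the polynomials which define the polynomial factor $\mathcal{B}$, where we assume each $P_i$ is homogeneous of degree $\deg(P_i) \le s$. Since each $h_i$ is measurable with respect to $\mathcal{B}$, we have $h_i(x) = \Gamma_i(P_1(x), \dots, P_C(x))$ where $\Gamma_i : \mathbb{F}_p^C \to \mathbb{D}$ is some function. Decompose $\Gamma_i$ into its Fourier decomposition as

$$\Gamma_i(z(1), \dots, z(C)) = \sum_{\gamma \in \mathbb{F}_p^C} c_{i,\gamma}\, e_p\Big(\sum_{j=1}^{C} \gamma(j) z(j)\Big),$$

where $|c_{i,\gamma}| \le 1$. Define for $\gamma \in \mathbb{F}_p^C$ the linear combination $P_\gamma(x) = \sum_{j=1}^{C} \gamma(j) P_j(x)$. We can express each $h_i$ as

$$h_i(x) = \sum_{\gamma \in \mathbb{F}_p^C} c_{i,\gamma}\, e_p(P_\gamma(x)), \qquad (18)$$

and we can express

$$\mathbb{E}\Big[\prod_{i=1}^{m} h_i(L_i(X))\Big] = \sum_{\gamma_1, \dots, \gamma_m \in \mathbb{F}_p^C} \Delta(\gamma_1, \dots, \gamma_m), \qquad (19)$$

where

$$\Delta(\gamma_1, \dots, \gamma_m) = \Big(\prod_{i=1}^{m} c_{i,\gamma_i}\Big)\, \mathbb{E}\big[e_p\big(P_{\gamma_1}(L_1(X)) + \dots + P_{\gamma_m}(L_m(X))\big)\big]. \qquad (20)$$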

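We will bound each term $\Delta(\gamma_1, \dots, \gamma_m)$ by $\tau := \tau(C) = p^{-mC} \varepsilon/2$, which will establish the result, since the sum in (19) has $p^{mC}$ terms. Let $S = \{\gamma \in \mathbb{F}_p^C : \deg(P_\gamma) \le d\}$. We first bound the terms $\Delta(\gamma_1, \dots, \gamma_m)$ with $\gamma_1 \in S$.

Claim 6.2. If the growth function $r(\cdot)$ is chosen large enough, and if $\delta > 0$ is chosen small enough, then for all $\gamma_1 \in S$ we have

$$|c_{1,\gamma_1}| \le \tau.$$

Consequently, for all $\gamma_1 \in S$ and $\gamma_2, \dots, \gamma_m \in \mathbb{F}_p^C$ we have

$$|\Delta(\gamma_1, \dots, \gamma_m)| \le \tau.$$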

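Proof. The bound on $|\Delta(\gamma_1, \dots, \gamma_m)|$ follows trivially from the bound on $|c_{1,\gamma_1}|$, since $|c_{2,\gamma_2}|, \dots, |c_{m,\gamma_m}| \le 1$. To bound $|c_{1,\gamma_1}|$, note that

$$c_{1,\gamma_1} = \mathbb{E}\big[h_1(X)\, e_p(-P_{\gamma_1}(X))\big] - \sum_{\gamma' \in \mathbb{F}_p^C,\ \gamma' \ne \gamma_1} c_{1,\gamma'}\, \mathbb{E}\big[e_p(P_{\gamma'}(X) - P_{\gamma_1}(X))\big],$$

where the averages are over uniform $X \in \mathbb{F}_p^n$. We first bound $\mathbb{E}[h_1(X)\, e_p(-P_{\gamma_1}(X))]$. Using the fact that $h_1 = \mathbb{E}(f_1|\mathcal{B})$ and that the function $e_p(-P_{\gamma_1}(x))$ is $\mathcal{B}$-measurable, we have by Observation 4.5, together with $\deg(P_{\gamma_1}) \le d$, that

$$\big|\mathbb{E}[h_1(X)\, e_p(-P_{\gamma_1}(X))]\big| = \big|\mathbb{E}[f_1(X)\, e_p(-P_{\gamma_1}(X))]\big| \le \|f_1\|_{U^{d+1}} \le \delta.$$

Hence, by choosing $\delta \le p^{-mC_{\max}(\eta, r(\cdot))} \varepsilon/4$ we guarantee that $|\mathbb{E}[h_1(X)\, e_p(-P_{\gamma_1}(X))]| \le \delta \le \tau/2$. Next, for $\gamma' \ne \gamma_1$, we bound each term $\mathbb{E}[e_p(P_{\gamma'}(X) - P_{\gamma_1}(X))]$ by $\tau p^{-C}/2$. Assume that for some $\gamma' \ne \gamma_1$ we have

$$\big|\mathbb{E}[e_p(P_{\gamma'}(X) - P_{\gamma_1}(X))]\big| > \tau p^{-C}/2.$$

Then by Theorem 4.10 we have that

$$\mathrm{rank}(P_{\gamma'} - P_{\gamma_1}) \le r_{p,s}(\tau p^{-C}/2) = r_1(C).$$

Thus, as long as we choose $r(C) > r_1(C)$ for all $C \in \mathbb{N}$, this cannot happen, and we have that

$$\sum_{\gamma' \ne \gamma_1} \big|c_{1,\gamma'}\, \mathbb{E}[e_p(P_{\gamma'}(X) - P_{\gamma_1}(X))]\big| \le \tau/2,$$

and we achieve the bound $|c_{1,\gamma_1}| \le \tau$. □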



Consider now any $\gamma_1 \notin S$. We will show that if we choose $r(\cdot)$ large enough we can guarantee that

$$\big|\mathbb{E}\big[e_p\big(P_{\gamma_1}(L_1(X)) + \dots + P_{\gamma_m}(L_m(X))\big)\big]\big| \le \tau, \qquad (21)$$

which will establish the result. Assume that this is not the case for some $\gamma_1 \notin S$ and $\gamma_2, \dots, \gamma_m \in \mathbb{F}_p^C$. By Lemma 5.1 there is a rank $r_{p,s}(\tau) = r_2(C)$ such that if we guarantee that $r(C) > r_2(C)$ for all $C \in \mathbb{N}$ and if (21) does not hold, then we must have

$$P_{\gamma_1}(L_1(x)) + \dots + P_{\gamma_m}(L_m(x)) \equiv 0. \qquad (22)$$

Let $t = \deg(P_{\gamma_1}) > d$. Let $P_\gamma^{(t)}$ be the degree $t$ homogeneous part of $P_\gamma$. Since the degrees of the polynomials are at most $p - 1$, we must have that

$$P_{\gamma_1}^{(t)}(L_1(x)) + \dots + P_{\gamma_m}^{(t)}(L_m(x)) \equiv 0. \qquad (23)$$

The following claim concludes the proof. It shows that if (23) holds, then $L_1^{t}$ is linearly dependent on $L_2^{t}, \dots, L_m^{t}$. This immediately implies that also $L_1^{d+1}$ is linearly dependent on $L_2^{d+1}, \dots, L_m^{d+1}$ (since $t \ge d + 1$), which contradicts our initial assumption.

Claim 6.3. Let $P_1, \dots, P_m$ be homogeneous polynomials of degree $t < p$, where $P_1$ is not identically zero, such that

$$P_1(L_1(x)) + \dots + P_m(L_m(x)) \equiv 0.$$

Then $L_1^{t}$ is linearly dependent on $L_2^{t}, \dots, L_m^{t}$.

Proof. Let $M(x) = x_{i_1} \cdot \ldots \cdot x_{i_t}$ be a monomial appearing in $P_1$ with a nonzero coefficient $\alpha_1 \ne 0$. Let $\alpha_i$ be the coefficient of $M(x)$ in $P_i$ for $2 \le i \le m$. We have that

$$\alpha_1 M(L_1(x)) + \dots + \alpha_m M(L_m(x)) \equiv 0.$$

Let $x = (x_1, \dots, x_k)$ and $L_i(x) = \lambda_{i,1} x_1 + \dots + \lambda_{i,k} x_k$. We have

$$M(L_i(x)) = \prod_{j=1}^{t} \big(\lambda_{i,1} x_1(i_j) + \dots + \lambda_{i,k} x_k(i_j)\big).$$

Consider the assignment $x_i = (z(i), \dots, z(i))$, where $z(1), \dots, z(k) \in \mathbb{F}_p$ are new variables. We thus have the polynomial identity

$$\sum_{i=1}^{m} \alpha_i \big(\lambda_{i,1} z(1) + \dots + \lambda_{i,k} z(k)\big)^{t} \equiv 0,$$

which, as $t < p$, is equivalent to

$$\sum_{i=1}^{m} \alpha_i L_i^{t} \equiv 0.$$

□

7 Summary and open problems 

We study the complexity of structures defined by linear forms. Let $\mathcal{L} = \{L_1, \dots, L_m\}$ be a system of linear forms of Cauchy-Schwarz complexity $s$. Our main technical contribution is that as long as $s \le p$, if $L_1^{d+1}$ is not linearly dependent on $L_2^{d+1}, \dots, L_m^{d+1}$, then averages $\mathbb{E}[\prod_{i=1}^{m} f_i(L_i(X))]$ are controlled by $\|f_1\|_{U^{d+1}}$.

When the first version of this article was submitted, Theorem 4.3 in the case of $p \le d$ was still unknown. However, as we mentioned in Section 4.1, very recently Tao and Ziegler [15] established this case. With the set of techniques of the present paper and [15], it seems plausible that Theorem 3.9 can be extended to Conjecture 7.1 below, which does not require any conditions on the Cauchy-Schwarz complexity of the system. We leave this for future work.



Conjecture 7.1. Fix a prime $p$ and $d \ge 1$. Let $\mathcal{L} = \{L_1, \dots, L_m\}$ be a system of linear forms over $\mathbb{F}_p$ such that $L_1^{d+1}$ is not linearly dependent on $L_2^{d+1}, \dots, L_m^{d+1}$. Then for every $\varepsilon > 0$ there exists $\delta > 0$ such that if $f_1, \dots, f_m : \mathbb{F}_p^n \to \mathbb{D}$ are functions such that $\|f_1\|_{U^{d+1}} \le \delta$, then

$$\Big|\mathbb{E}\Big[\prod_{i=1}^{m} f_i(L_i(X))\Big]\Big| \le \varepsilon.$$